Development of a Methodology for the Application of Generative Neural Networks in Creating 3d Models

Bondareva, N.A.; Bondarev, A.E.; Andreev, S.V.; Ryzhova, I.G.

doi:10.26583/sv.17.3.03

Scientific Visualization, 2025, volume 17, number 3, pages 25 - 34, DOI: 10.26583/sv.17.3.03


Authors: N.A. Bondareva¹, A.E. Bondarev², S.V. Andreev³, I.G. Ryzhova⁴

Keldysh Institute of Applied Mathematics RAS, Moscow, Russia

¹ ORCID: 0000-0002-7586-903X, nicibond9991@gmail.com

² ORCID: 0000-0003-3681-5212, bond@keldysh.ru

³ ORCID: 0000-0001-8029-1124, esa@keldysh.ru

⁴ ORCID: 0000-0003-1613-3038, ryzhova@gin.keldysh.ru

Abstract

The article considers the current scientific and technical problem of integrating generative neural network architectures into the process of automated 3D modeling. Despite significant progress in this area, existing solutions are often characterized by insufficient transparency and limited capabilities of deterministic control by design engineers. In this regard, the concept of an innovative hybrid methodological approach based on the synergistic interaction of intelligent natural language processing systems and verified engineering software packages is proposed. The purpose of the proposed approach is to significantly increase the efficiency and accuracy of the design process by minimizing the likelihood of errors and ensuring the possibility of prompt adjustment at all stages of creating 3D models. The methodology is based on the integration of AI capabilities in the field of semantic analysis and generation of variable design solutions with existing CAD modeling algorithms. The results of experimental verification of the proposed concept are presented, demonstrating a significant reduction in the time spent on creating 3D models compared to traditional methods, which indicates the promise of the developed approach for practical application in engineering activities.

Keywords: 3D modeling, computer-aided design (CAD), generative neural networks, autostereoscopic monitor.

Introduction

Generative neural networks are being actively adopted across many areas of human activity and are having a significant impact on work processes. They demonstrate impressive capabilities in creating new data similar to the data on which they were trained, from realistic images and videos to music and text. Generative networks have shown a capacity for creativity and innovation beyond traditional algorithms. In design, for example, they are used to create new product concepts, architectural designs, and fashion collections [1-3].

In addition to entertainment and media industries, generative neural networks are actively being implemented in high-tech industries such as banking, law, and industrial production. In medicine, research is underway on the use of neural network technologies for analyzing medical images and developing personalized treatment plans.

However, despite this impressive potential, the application of generative neural networks to manufacturing tasks, especially computer-aided design (CAD), faces serious challenges [4]. Although neural networks are potentially capable of generating design documentation and 3D models, their "black box" nature, that is, the unpredictability and opacity of the generation process, becomes a significant and dangerous limitation.

Unlike traditional CAD methods, where each design step is controlled and documented, neural networks often produce results without a clear explanation of how they were obtained. This makes it difficult to validate the results, find and correct errors, and make adjustments during modeling. It also undermines trust in the results and limits their use in critical projects that require strict quality control and compliance with standards. As a result, despite the potential to automate and accelerate design, introducing generative neural networks into CAD production processes requires solving the "black box" problem and developing methods that ensure deterministic control and clarity of the generation process.

This study proposes a concept of a hybrid methodological approach designed to overcome these limitations. The approach is based on the synergy of natural language processing (NLP) and verified engineering software packages. It is assumed that the combination of these two approaches will minimize the likelihood of errors and inaccuracies in the design process, while ensuring the necessary level of control by specialists.

The proposed methodology integrates the capabilities of artificial intelligence systems in natural language processing and the rapid generation of design variants with existing algorithms for constructing CAD models in domestic computer-aided design systems such as KOMPAS-3D [5] and T-FLEX CAD [6].

The approach is justified by the fact that relying exclusively on neural network technologies does not guarantee that all design features are taken into account and does not allow manual correction of identified discrepancies. For the experimental verification of the concept, the open software platform Blender was used. By combining the two technological approaches, basic three-dimensional geometric objects were synthesized: a sphere, a cube, a cone, and a gear wheel. Each object could be modified parametrically.

Thus, this study aims to develop and verify a hybrid methodological approach to automated design that combines the capabilities of AI and traditional CAD systems. The results of the study can contribute to improving the efficiency and accuracy of the design process, as well as expanding the control capabilities of specialists.

Methodology

The proposed methodology is a hybrid approach to automated 3D modeling that combines natural language processing (NLP) with proven engineering software packages (CAD) such as KOMPAS-3D or T-FLEX CAD. The approach aims to minimize errors and increase the accuracy of the modeling process compared with using generative neural networks alone. The key advantage is that the specialist verifies the parameters of the AI-generated script instead of checking the entire generated model.

Instead of directly using a neural network to generate a 3D model, which is fraught with hidden errors, text AI is used to create a control script in a programming language compatible with the selected CAD system. This allows shifting the focus of control from checking the finished model to verifying the parameters specified in the script, ensuring earlier detection and correction of potential errors. The iterative nature of the process involves adjusting the prompt and script based on the analysis of intermediate results, which ensures flexibility and high accuracy of the final 3D model.
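The idea of verifying script parameters rather than the finished model can be illustrated with a minimal sketch. It assumes that the generated script defines a function with keyword defaults; the helper names `extract_defaults` and `check_bounds` and the limits in `LIMITS` are hypothetical and are not part of the described experiments. The script is parsed, not executed, so the check can run before anything reaches the CAD system.

```python
import ast


def extract_defaults(script: str, func_name: str) -> dict:
    """Collect the keyword defaults of `func_name` from a script without running it."""
    tree = ast.parse(script)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            args = node.args.args
            defaults = node.args.defaults
            # Defaults belong to the last len(defaults) positional arguments.
            named = args[len(args) - len(defaults):]
            return {a.arg: ast.literal_eval(d) for a, d in zip(named, defaults)}
    raise LookupError(f"function {func_name!r} not found")


def check_bounds(params: dict, limits: dict) -> list:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    for name, (low, high) in limits.items():
        if name not in params:
            problems.append(f"{name}: missing")
        elif not (low <= params[name] <= high):
            problems.append(f"{name}={params[name]} outside [{low}, {high}]")
    return problems


# Stand-in for a script returned by the text AI (body omitted for brevity).
GENERATED = '''
def create_gear_2_82(teeth=12, radius=0.1, thickness=0.02,
                     tooth_depth=0.02, tooth_width=0.02):
    pass
'''

# Hypothetical engineering limits (meters, except the tooth count).
LIMITS = {"teeth": (6, 200), "radius": (0.01, 1.0), "thickness": (0.001, 0.5)}

params = extract_defaults(GENERATED, "create_gear_2_82")
issues = check_bounds(params, LIMITS)
print(params, issues)
```

A non-empty `issues` list would be a natural trigger for correcting the prompt or the script before the model is built.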

During the research, a tentative algorithm was developed for applying this approach in practice. Its scheme is presented in Fig. 1.

Fig. 1. Algorithmic diagram of the described methodology

1) Problem Statement: Clearly define the requirements for the 3D model, including functionality, geometric parameters and constraints.

2) Prompt generation: Formulate a text query (prompt) for the text AI describing the desired 3D part, taking into account the results of step 1.

3) Script generation: Use the text AI to generate control code (a script) in a programming language compatible with the selected CAD system (e.g. Python for Blender, or the programming languages supported by KOMPAS-3D or T-FLEX CAD).

4) Script processing in CAD: Run the generated script in a CAD system to automatically build a 3D model.

5) Correction and validation: A specialist analyzes the resulting 3D model and makes the necessary corrections to the prompt or script based on that analysis. Stages 3-5 are repeated until a satisfactory result is achieved.

6) Save Model: Save the finished 3D model in a suitable format for further use (e.g. for stereoscopic display).
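The steps above can be sketched as a small control loop. This is only an illustration under stated assumptions: the text AI, the CAD system and the specialist's review are passed in as plain callables, and every name here (`Requirements`, `build_prompt`, `design_loop`, and the stub functions) is hypothetical. In this sketch the reviewer's feedback is folded back into the prompt, which is one possible way to implement the correction stage.

```python
from dataclasses import dataclass, field


@dataclass
class Requirements:
    """Step 1: problem statement for a single part."""
    description: str
    parameters: dict = field(default_factory=dict)
    constraints: list = field(default_factory=list)


def build_prompt(req, target="Blender Python (bpy)", feedback=""):
    """Step 2: turn the structured requirements into a text prompt."""
    lines = [f"Write a {target} script that creates: {req.description}.",
             "Expose these values as named parameters:"]
    lines += [f"- {name} = {value}" for name, value in req.parameters.items()]
    lines += [f"Constraint: {c}" for c in req.constraints]
    if feedback:
        lines.append(f"Fix the following issue from the previous attempt: {feedback}")
    return "\n".join(lines)


def design_loop(req, generate_script, run_in_cad, review, max_rounds=5):
    """Steps 2-5, repeated until the reviewer returns an empty feedback string."""
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        prompt = build_prompt(req, feedback=feedback)
        script = generate_script(prompt)   # step 3: text AI
        model = run_in_cad(script)         # step 4: CAD system
        feedback = review(model)           # step 5: specialist
        if not feedback:
            return model, round_no         # step 6 (saving) is left to the caller
    raise RuntimeError(f"no accepted model after {max_rounds} rounds")


# Stubs standing in for the text AI, the CAD system and the specialist.
def fake_ai(prompt):
    teeth = 12 if "Fix" in prompt else 10
    return f"create_gear(teeth={teeth})"


def fake_cad(script):
    return {"script": script}


def fake_review(model):
    return "" if "teeth=12" in model["script"] else "gear must have 12 teeth"


req = Requirements("a spur gear", {"teeth": 12, "radius": 0.1}, ["units are meters"])
model, rounds = design_loop(req, fake_ai, fake_cad, fake_review)
print(rounds, model)
```

Passing the three stages as callables keeps the loop independent of any particular AI service or CAD package, which matches the intent of using the same scheme with different CAD systems.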

This methodology offers a compromise between automation and controllability of the 3D modeling process, combining the advantages of AI and proven engineering tools. One benefit is reduced modeling time due to process automation; unlike specialized neural networks for 3D modeling, however, this approach lets the specialist control the process and make the necessary adjustments immediately. The entry threshold for users without deep programming knowledge is also lowered. At the same time, the value of advanced specialists only increases, since validation and verification tasks require in-depth professional knowledge and extensive experience.

However, the use of such a methodology poses a new set of problems that must be taken into account when working. Among them:

∙ dependence on the quality of text AI and the correctness of the prompt;

∙ the need for basic knowledge in working with the selected CAD system;

∙ the possibility of errors in the generated script;

∙ the restriction to text-only descriptions when formulating a task, which is inconvenient for engineers.

Experimental results

It should be noted that there are two approaches to 3D modeling: manual modeling, which allows the creation of more unique and detailed objects, and scripted modeling, which focuses on automating the modeling process and creating parametric models. Scripted modeling is suitable for creating complex structures or series of similar objects with varying parameters that are difficult to model manually.

Scripted modeling is used in the following areas:

∙ creation of parametric models (machine parts, architectural elements);

∙ generation of complex structures and patterns;

∙ automation of the modeling process;

∙ projects requiring high precision and reproducibility.

Scripted modeling requires programming skills in addition to modeling skills. Writing such a script requires knowledge of basic syntax and an understanding of object construction algorithms; the time needed to debug and test the program should also be taken into account.

A simple task such as drawing primitives can take a novice programmer 1-2 hours to write and debug. A professional developer with experience writing scripts in Python or JavaScript can complete the same task in 10 to 30 minutes. The proposed approach reduces the time needed to create an individual script and increases work efficiency.

For the experimental verification of the concept, the open software platform Blender [7] was used, where basic three-dimensional geometric objects (a sphere, a cube, a cone, and a gear wheel) were synthesized by combining the two technological approaches.

Blender was chosen as a platform for conducting preliminary experiments due to the following factors:

1) Open source and freely available: Using Blender eliminates licensing restrictions and provides unlimited access to the source code;

2) Python scripting support;

3) Rich API and large community: Blender's well-documented API and active developer community provide access to a wide range of tools and libraries, making it easy to integrate with external systems and expand your experimentation capabilities;

4) Blender provides parametric modeling tools, which allows you to create 3D models with variable parameters set through scripts. This is essential for testing the hypothesis about the influence of parameters set by text AI on the characteristics of the generated models;

5) Blender provides a wide range of tools for subsequent manual editing and modification of 3D models, which allows you to analyze the results and make the necessary adjustments.

Automated modeling based on the execution of program code (a script) was applied: geometric modeling is carried out by automatically executing a script generated by an external system. This approach is highly productive for generating primitive geometric shapes and relatively simple composite objects whose geometry is completely determined by a set of input parameters.

This approach also makes it easy to vary the specified parameters in various combinations (Fig. 2).

Fig. 2. An example of using the control code for modeling a 3D figure (gear)
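Varying parameters in combinations can be sketched with a simple grid generator. The value lists below are hypothetical examples, and `parameter_grid` and `render_call` are illustrative helper names, not functions from the study; the sketch only produces the text of the calls that would then be executed in the CAD system.

```python
from itertools import product


def parameter_grid(**options):
    """Yield one dict per combination of the supplied parameter values."""
    names = list(options)
    for values in product(*(options[name] for name in names)):
        yield dict(zip(names, values))


def render_call(func_name, params):
    """Format a call such as create_gear_2_82(teeth=12, radius=0.1)."""
    args = ", ".join(f"{name}={value!r}" for name, value in params.items())
    return f"{func_name}({args})"


# Hypothetical value ranges; lengths are in meters.
variants = list(parameter_grid(teeth=[12, 16, 24],
                               radius=[0.1, 0.15],
                               thickness=[0.02]))
calls = [render_call("create_gear_2_82", p) for p in variants]

for line in calls:
    print(line)
```

Three tooth counts, two radii and one thickness give six variants, each of which can be reviewed as a single line of parameters before it is run.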

Initially, the figure was defined by the description of the following set of parameters (Table 1):

Table 1. Set of forming parameters for the 3D model of a toothed gear.

def create_gear_2_82(
    teeth=12,          # number of teeth
    radius=0.1,        # radius (10 cm)
    thickness=0.02,    # thickness (2 cm)
    tooth_depth=0.02,  # tooth depth (2 cm)
    tooth_width=0.02,  # tooth width (2 cm)
):
    ...  # function body not shown

Based on the results of the script execution, the program generated a gear model, shown in Fig. 3.
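The body of the gear script is not reproduced here. As a hedged illustration of how the Table 1 parameters could fully determine the geometry, the following pure-Python sketch builds a simplified gear with rectangular teeth. It assumes that teeth extend outward from `radius` by `tooth_depth` and that `tooth_width` is the chord width of a tooth at the root circle; the function names `gear_profile` and `extrude` are hypothetical, and real gear teeth would normally use an involute profile.

```python
import math


def gear_profile(teeth=12, radius=0.1, tooth_depth=0.02, tooth_width=0.02):
    """Return the 2D outline of a simplified gear as (x, y) points, counter-clockwise."""
    if teeth < 3:
        raise ValueError("a gear needs at least 3 teeth")
    outer = radius + tooth_depth
    pitch = 2 * math.pi / teeth                               # angle per tooth plus gap
    half = math.asin(min(1.0, tooth_width / (2 * radius)))    # half-angle of one tooth
    if 2 * half >= pitch:
        raise ValueError("teeth overlap: reduce tooth_width or the number of teeth")
    points = []
    for i in range(teeth):
        centre = i * pitch
        # Four corners per tooth: root, tip, tip, root.
        for r, a in ((radius, centre - half), (outer, centre - half),
                     (outer, centre + half), (radius, centre + half)):
            points.append((r * math.cos(a), r * math.sin(a)))
    return points


def extrude(points, thickness=0.02):
    """Turn a closed 2D outline into vertices and faces of a prism."""
    n = len(points)
    verts = [(x, y, 0.0) for x, y in points] + [(x, y, thickness) for x, y in points]
    faces = [tuple(range(n - 1, -1, -1)),        # bottom cap
             tuple(range(n, 2 * n))]             # top cap
    faces += [(i, (i + 1) % n, n + (i + 1) % n, n + i) for i in range(n)]  # side quads
    return verts, faces


points = gear_profile()
verts, faces = extrude(points)
print(len(points), len(verts), len(faces))
```

Inside Blender, vertex and face lists of this kind can be turned into a mesh with `Mesh.from_pydata(vertices, edges, faces)`; that step is omitted here so that the sketch runs without Blender.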

This approach provides the possibility of operational software modification of the generated 3D model. As an illustration, Fig. 4 shows an example of parametric modification of the cone geometry (the parameters are given in Table 2). It consists of performing a Boolean subtraction operation, as a result of which a segment constituting a quarter of the volume of the original cone is removed.

Fig. 3. Visualization of the “gear” part

Table 2. Set of forming parameters for the 3D model of a cone.

import bpy

# Creating a cone
# Base radius = 0.1 m (10 cm)
# Depth (height) = 0.2 m (20 cm); can be changed as desired
bpy.ops.mesh.primitive_cone_add(
    radius1=0.1,           # base radius in meters
    radius2=0,             # radius of the top (0 for a sharp cone)
    depth=0.2,             # cone height
    location=(0, 0, 0.1),  # raised by half the height so that the base lies on the grid plane
)

Fig. 4. Cone with a cut-out segment
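A simple numerical check can support validation of the cut cone. Assuming that the removed segment is a 90° sector about the cone axis (which by symmetry is a quarter of the volume) and using the Table 2 values r = 0.1 m and h = 0.2 m, the expected volumes are:

```latex
\begin{aligned}
V_{\text{cone}} &= \tfrac{1}{3}\pi r^{2} h
                 = \tfrac{1}{3}\pi\,(0.1)^{2}(0.2)
                 \approx 2.094 \times 10^{-3}\ \text{m}^{3},\\
V_{\text{cut}}  &= \tfrac{3}{4}\,V_{\text{cone}}
                 = \tfrac{\pi}{2000}
                 \approx 1.571 \times 10^{-3}\ \text{m}^{3}.
\end{aligned}
```

Comparing the volume reported by the CAD system with the second value is one way for a specialist to confirm that the Boolean subtraction removed the intended segment. A faceted mesh approximates the cone, so a small deviation is to be expected.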

Transition to stereo

Visualization using stereoscopic technologies significantly enhances the viewer's perception of depth and spatial characteristics of an object, bringing its presentation closer to real perception. The key advantage of this method is the creation of conditions that are as close as possible to natural visual perception.

The project, conducted at the Keldysh Institute of Applied Mathematics of the Russian Academy of Sciences, investigates methods for creating stereoscopic representations of the results of scientific research. Two stereoscopic systems are used for the experiments: a classic setup and an autostereoscopic Dimenco monitor, which provides viewing of a stereo image without special glasses. The autostereoscopic monitor allows the formation of an integrated image, including multiple projections of an object, thereby expanding the range of viewing angles. A detailed description of the autostereoscopic display technology is given in [8-9].
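As a rough illustration of the multi-view idea, the sketch below places several virtual cameras on a horizontal arc around the object, one per projection. The number of views, the angular spread and the viewing distance are hypothetical values chosen for the example; they are not the settings used in the experiments or required by the monitor, and `multiview_cameras` is an illustrative name.

```python
import math


def multiview_cameras(n_views=9, distance=1.0, spread_deg=20.0,
                      target=(0.0, 0.0, 0.0)):
    """Return camera positions on a horizontal arc, all at `distance` from `target`."""
    if n_views < 2:
        raise ValueError("multi-view rendering needs at least two views")
    cameras = []
    for i in range(n_views):
        t = i / (n_views - 1) - 0.5            # runs from -0.5 to 0.5
        angle = math.radians(spread_deg) * t   # offset from the central view
        x = target[0] + distance * math.sin(angle)
        y = target[1] - distance * math.cos(angle)
        cameras.append((x, y, target[2]))
    return cameras


cams = multiview_cameras()
for position in cams:
    print(tuple(round(c, 4) for c in position))
```

Each position would be used to render one projection of the model, and the projections would then be combined into the composite frame.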

Fig. 5 shows a sample of a composite stereoscopic frame constructed using the multi-view method (on the right) and a separately enlarged image allowing one to examine details (on the left). The left image is purely illustrative and is not part of the stereo frame.

Fig. 5. Results of hybrid gear modeling (main frame + multi-view representation)

The modified cone model, with a removed segment amounting to a quarter of its volume, was also shown in stereo (Fig. 6).

Fig. 6. Results of hybrid modeling of a cone with a cut segment (main frame + multi-view representation)

The complex geometry of 3D models, characterized by the presence of curved surfaces, is of considerable interest for studies of stereoscopic perception, since the subjective perception of depth and shape can vary widely depending on many factors, including individual characteristics of the visual system and the physical dimensions of the object.

Discussion

In this paper, a new hybrid approach to automated 3D modeling was presented, combining natural language processing (NLP) and traditional CAD methods. This approach, based on AI-based script generation and subsequent processing in specialized software, demonstrates a number of advantages and disadvantages that need to be discussed to assess its practical applicability and prospects for further development.

On the one hand, the proposed method demonstrates significant potential for improving the efficiency and accuracy of the 3D modeling process. Automation of script generation based on text descriptions significantly reduces the time required to create basic models, and the ability to check script parameters at early stages allows minimizing errors and increasing the reliability of the final result. The flexibility of the iterative process, which allows you to adjust both text queries and the generated code, makes this approach adaptable to various tasks and requirements. A specialist can focus on adjusting and improving the model, rather than on the routine creation of basic geometry. As the system is used and the results are adjusted, it is possible to train AI to improve the quality of the generated scripts.

However, it is necessary to recognize certain limitations. The quality of the generated scripts directly depends on the quality of the text request (prompt), which requires certain skills and understanding of the capabilities of AI from the user. In addition, the need for manual correction and validation of results limits the degree of full automation of the process. There may also be errors in the generated code that require specialist intervention. Finally, the applicability of this approach may be limited by the capabilities of the CAD systems and programming languages used. There are also problems with scalability at this stage: the approach may be effective for creating individual parts, but its capabilities for complex assemblies and projects may be limited.

The proposed hybrid approach is compared with existing approaches to 3D modeling using neural networks, which can be roughly divided into several categories.

1. Fully generative models [10]: These models, such as PointNeRF and GAN-based models (e.g. StyleGAN for 3D), use neural networks to generate 3D models directly from noise or a latent space. Their advantages include high generation speed and the ability to create new, unique shapes. However, the "black box" problem is pronounced here: it is difficult to control the generation process and make adjustments, so model quality can be unpredictable, and post-processing can take as much time as designing from scratch. This method is suitable for areas where precision and attention to detail are not required.

2. Models based on 2D-to-3D transformation [11]: These models use neural networks to transform 2D images (or multiple images) into 3D models. Examples include multi-view image-based and sketch-based methods. This makes it possible to create 3D models from available 2D data (photographs, drawings). However, the quality of the 3D models depends heavily on the quality and quantity of the 2D data, and it can be difficult to obtain accurate geometry and detail. The "black box" problem is also present.

The proposed hybrid approach occupies an intermediate position between fully generative models and scripted modeling. It uses text-based AI to transform informal requirements into parameters, which are then used to control the generation of a 3D model. Once generated, the model can be manually refined.

There are a number of promising areas for further research and improvement. Here are some of them:

∙ development of specialized text AI for CAD;

∙ automation of prompt engineering: development of algorithms that automatically generate optimal prompts based on specified requirements for a 3D model;

∙ integration with knowledge bases and ontologies;

∙ development of APIs for data exchange between neural networks and CAD systems: creation of standardized APIs that allow easy integration of neural networks with existing CAD systems;

∙ interactive model editing: Developing interfaces that allow professionals to interactively edit generated models using CAD tools and automatically update scripts;

∙ integration with systems that recognize drawings: this will allow the creation of a mixed prompt based on a graphic request and a text description.

Conclusion

The article discusses a hybrid approach to 3D modeling that combines natural language processing and traditional CAD methods and offers a promising combination of automation and controllability. The methodology is described and tested on real modeling problems. The resulting models are presented on an autostereoscopic monitor using a multi-view representation.

Despite the identified limitations related to the quality of input data and the need for manual adjustments, the method showed significant potential for accelerating and increasing the accuracy of the 3D model creation process, especially for parametric tasks. Further research should be aimed at improving the quality of script generation, automating the validation process, and expanding compatibility with various CAD systems. The results obtained open up new opportunities for increasing the efficiency and accessibility of 3D modeling for a wide range of users.

References

1. J. Ho, A. Jain, P. Abbeel, Denoising Diffusion Probabilistic Models, 2020, https://doi.org/10.48550/arXiv.2006.11239 (accessed 29.03.2023)

2. C. Meng, Y. He, Y. Song, J. Song, J. Wu, J. Zhu, S. Ermon, SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations, 2022, https://doi.org/10.48550/arXiv.2108.01073

3. A. Radford, J.W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, I. Sutskever, Learning Transferable Visual Models From Natural Language Supervision, 2021, arXiv preprint arXiv:2103.00020 [cs.CV], https://doi.org/10.48550/arXiv.2103.

4. N.A. Bondareva, Graphic neural networks and image verification problems // Proceedings of the 33rd International Conference on Computer Graphics and Machine Vision GraphiCon 2023, V.A. Trapeznikov Institute of Control Sciences of the Russian Academy of Sciences, Moscow, Russia, September 19-21, 2023, pp. 317-327, DOI: 10.20948/graphicon-2023-317-327

5. KOMPAS-3D. Russian import-independent system of three-dimensional design. URL: https://kompas.ru/ (accessed 29.04.2025)

6. T-FLEX CAD. Russian engineering software for 3D design and development of design documentation. URL: https://www.tflexcad.ru/ (accessed 29.04.2025)

7. Blender. URL: https://www.blender.org/ (accessed 29.04.2025)

8. S.V. Andreev, N.A. Bondareva, Constructing a representation of textual information in stereo presentations // Proceedings of the 28th International Conference of Computer Graphics and Vision GraphiCon-2018, TUSUR Publishing, Tomsk, 24-27 September 2018, pp. 86-89.

9. S.V. Andreev, N.A. Bondareva, A.E. Bondarev, Expansion of the Functions of the Multi-View Stereomaker Software for Automatic Construction of Complex Stereo Images // Scientific Visualization, 2021, Vol. 13, No. 2, pp. 149-156, DOI: 10.26583/sv.13.2.10

10. Masterpiece Studio. URL: https://masterpiecestudio.com/ (accessed 29.04.2025)

11. Kaedim. AI-powered Art Outsourcing. URL: https://www.kaedim3d.com/ (accessed 29.04.2025)

Scientific Visualization

Open Access Electronic Journal