
Applying Systems Engineering V-Model to Address Collaboration Challenges in Building Machine Learning-Enabled Software


Core Concepts
The V-Model from Systems Engineering provides an effective approach to address the interdisciplinary collaboration challenges encountered when building machine learning-enabled software systems.
Abstract
This paper explores the application of the Systems Engineering V-Model to address the collaboration challenges in building machine learning (ML)-enabled software systems. The key insights are:

Requirement Engineering: System-level requirements should be created and actively maintained so that they stay current as requirements change, with the participation of owners of ML and non-ML components. The V-Model's clear system boundaries and responsibilities help ensure consistent requirements are defined across the system, subsystems, and components.

Architecture, Design, and Implementation: System-level architecture design with elements, interfaces, responsibilities, alternatives, and expected performances should be created and actively maintained. Risks such as design changes or improvements due to uncertainty in ML components must be actively identified and mitigated. The V-Model's enforcement of validation and verification (V&V) and risk management helps address these challenges.

Model Development: Requirements and detailed design of ML components with interfaces, alternatives, and expected performances should be created, with participation of owners of external and internal components like data and infrastructure. The V-Model's clear boundaries and responsibilities ensure these aspects are properly defined and reviewed.

Data Engineering: Data should be treated as a separate component with standalone requirements, design synthesis, and system validation (data validation and monitoring). The V-Model's component-level focus enables proper attention to data quality and evolution.

Quality Assurance: V&V at the system, subsystem, and component levels (ML, non-ML, data, infrastructure) should be enforced with identified owners. The V-Model's consistency checks across system levels help discover quality issues.

Process: The software development lifecycle for ML-enabled systems should follow layered decomposition of systems, subsystems, and components, with continuous in-process V&V and risk management. The V-Model's clear boundaries and responsibilities address the ad-hoc nature of ML development processes.

Organization, Teams, and Responsibility: Documentation at the system, subsystem, and component levels should be created, approved, and tracked, with consolidated terminology understood by all roles. The V-Model's inclusive documentation and access control help bridge the knowledge gap between different roles.

Overall, the study found that, despite requiring additional effort, the characteristics of the V-Model align effectively with several collaboration challenges encountered when building ML-enabled systems. Future research should investigate new process models that leverage the V-Model's strengths.
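To make the Data Engineering insight more concrete, the following is a minimal sketch of what treating data as a separate component with standalone requirements and validation might look like. It is an illustration only and is not taken from the paper: the `DataRequirement` class, the requirement IDs, the thresholds, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[float]]


@dataclass(frozen=True)
class DataRequirement:
    """One standalone requirement on the data component (hypothetical structure)."""
    req_id: str
    description: str
    check: Callable[[List[Row]], bool]


def min_row_count(n: int) -> Callable[[List[Row]], bool]:
    """Build a check that passes when the dataset has at least n rows."""
    return lambda rows: len(rows) >= n


def max_null_rate(column: str, threshold: float) -> Callable[[List[Row]], bool]:
    """Build a check that passes when the share of missing values is within threshold."""
    def check(rows: List[Row]) -> bool:
        if not rows:
            return False  # an empty dataset cannot satisfy a quality requirement
        nulls = sum(1 for row in rows if row.get(column) is None)
        return nulls / len(rows) <= threshold
    return check


def validate(rows: List[Row], requirements: List[DataRequirement]) -> Dict[str, bool]:
    """Evaluate every data requirement and report pass/fail per requirement ID."""
    return {req.req_id: req.check(rows) for req in requirements}


# Assumed example requirements; real ones would come from the data component's owners.
REQUIREMENTS = [
    DataRequirement("DATA-001", "At least 3 training rows", min_row_count(3)),
    DataRequirement("DATA-002", "At most 25% missing labels", max_null_rate("label", 0.25)),
]

SAMPLE_ROWS: List[Row] = [
    {"feature": 0.1, "label": 1},
    {"feature": 0.4, "label": None},
    {"feature": 0.9, "label": 0},
    {"feature": 0.7, "label": 1},
]

REPORT = validate(SAMPLE_ROWS, REQUIREMENTS)
print(REPORT)
```

Keeping each requirement as a named, individually checkable item is one way to give the data component the same traceability that other components get from their own requirements; the same checks could be rerun on fresh data for monitoring.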
Stats
"To train a new ML model, we need to backfill the data in the past few weeks to be used in training, and it requires non-trivial work and is burdensome." (P5)

"Engineers had to do multiple iterations to fix several data issues and finally get the correct data they needed for model training." (P3)

"The quality of data (team's wiki documents) was so low that the ChatGPT output using the low-quality data turned out to work poorly." (P3)
Quotes
"SDEs are more emphasized on the coding standards. Scientists are more focused on model accuracy and are less focused on coding standards and code comments, class interface definitions, etc. It's a problem for scientists that how to improve their coding standard in ML model development, and who will own and maintain the code of the model." (P6)

"it's not clear if the root cause of the ticket is in the ML component or non-ML component." (P1, P2)

"it would help a lot if there were metrics, monitoring, or tools on ML and non-ML components to distinguish whether the issue is from the ML model or not." (P1, P2)

Deeper Inquiries

How can the V-Model be adapted to better balance the trade-offs between the additional effort required and the benefits it provides for building ML-enabled systems?

The V-Model can be adapted to better balance the trade-offs by incorporating more flexibility and agility into its structure. One approach could be to introduce iterative cycles within each phase of the V-Model, allowing for incremental development and feedback loops. This would enable teams working on ML-enabled systems to adapt to changes more effectively and address uncertainties as they arise. Additionally, streamlining documentation processes and emphasizing clear communication channels between different roles can help reduce the additional effort required while maximizing the benefits of using the V-Model. By focusing on continuous validation and verification throughout the development process, teams can ensure that the system meets the desired requirements and quality standards without compromising on efficiency.
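As a rough illustration of continuous V&V with lightweight documentation, the sketch below shows a small check that could run in each iteration and flag artifacts lacking an owner or a verification result. It assumes a simple in-memory record of artifacts; the `Artifact` class, the level names, and the example entries are hypothetical and not drawn from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional

# Decomposition levels assumed for this sketch.
LEVELS = ("system", "subsystem", "component")


@dataclass(frozen=True)
class Artifact:
    """A requirement or design item tracked at one decomposition level (hypothetical)."""
    name: str
    level: str
    owner: Optional[str]
    verified: bool


def vv_gaps(artifacts: List[Artifact]) -> List[str]:
    """Return a human-readable list of ownership and verification gaps."""
    gaps = []
    for artifact in artifacts:
        if artifact.level not in LEVELS:
            gaps.append(f"{artifact.name}: unknown level {artifact.level!r}")
        if not artifact.owner:
            gaps.append(f"{artifact.name}: no owner")
        if not artifact.verified:
            gaps.append(f"{artifact.name}: not verified")
    return gaps


# Assumed example entries covering ML, non-ML, and data components.
ARTIFACTS = [
    Artifact("recommendation service", "system", "platform-team", True),
    Artifact("ranking model", "component", None, False),
    Artifact("training data feed", "component", "data-team", False),
]

GAPS = vv_gaps(ARTIFACTS)
for gap in GAPS:
    print(gap)
```

A check of this kind keeps the overhead small: the record is the documentation, and the gate only reports what is missing rather than requiring a separate review step.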

How can the V-Model be integrated with agile methodologies to create a hybrid approach that combines the rigor of systems engineering with the responsiveness of iterative development for ML-enabled systems?

Integrating the V-Model with agile methodologies can create a hybrid approach that leverages the strengths of both frameworks. One way to achieve this integration is to align the phases of the V-Model with the iterative sprints of agile development. Each sprint can focus on a specific phase, such as requirements gathering, design, implementation, testing, or validation, following the V-Model's structured approach. This allows for a systematic and disciplined process while also enabling the quick feedback and adaptation to change that characterize agile methodologies. By incorporating agile principles such as collaboration, flexibility, and customer feedback into the V-Model framework, teams can combine rigor with responsiveness when developing ML-enabled systems.

What new process models can be developed that leverage the strengths of the V-Model, such as clear system boundaries and responsibilities, while addressing its limitations around flexibility and adaptability to changes?

One potential new process model that can leverage the strengths of the V-Model while addressing its limitations is a "Flexible V-Model" that incorporates elements of agile methodologies. This model could maintain the clear system boundaries and responsibilities defined by the V-Model while introducing iterative and adaptive practices to enhance flexibility and adaptability. By allowing for incremental development, continuous feedback loops, and regular reassessment of requirements, the Flexible V-Model can better accommodate changes and uncertainties in ML-enabled systems. Additionally, incorporating risk management strategies and prioritizing communication and collaboration among team members can help mitigate the rigidity of the traditional V-Model and enhance its ability to respond to evolving project needs. This hybrid approach aims to strike a balance between structure and agility, optimizing the development process for ML-enabled systems.