Latest news with #dataPreparation
Yahoo
20-06-2025
- Business
- Yahoo
INOD's DDS Momentum Surges: Is the Trend Set to Continue?
Innodata's INOD Digital Data Solutions (DDS) segment continues to anchor its growth as enterprises increasingly seek partners that offer full lifecycle support for Generative AI. With trust and safety becoming critical for AI adoption, Innodata is positioning DDS not just as a data preparation engine but as a platform provider that enables safe, scalable GenAI.

In the first quarter of 2025, DDS segment revenues surged 158% year over year to $50.8 million, supported by new client wins and expanded relationships with Big Tech. The company secured $8 million in new deals from four major technology customers, while previously smaller accounts are scaling into multi-million-dollar relationships.

Innodata recently launched its Generative AI Test & Evaluation Platform, a new suite designed to help enterprises assess the safety and reliability of large language models (LLMs). Built on NVIDIA's NIM microservices, the platform supports hallucination detection, adversarial prompt testing and domain-specific risk benchmarking across text, image, audio and video inputs, helping organizations build more trustworthy AI.

The platform, which is currently in early access with MasterClass as the first charter customer, is expected to monetize in the second half of 2025. With growing demand for evaluation and trust infrastructure around GenAI, Innodata expects the platform to deepen enterprise adoption and support recurring revenue momentum across its DDS segment.

Innodata's DDS business faces competition from emerging AI service providers like BigBear.ai BBAI and Grid Dynamics GDYN, both of which are expanding their GenAI capabilities across evaluation and enterprise deployment.

BigBear.ai offers decision intelligence and model validation tools designed to assess AI performance and risk in mission-critical environments such as defense and healthcare. While BigBear.ai provides advanced analytics for model reliability, it lacks Innodata's specialized focus on multimodal hallucination detection and adversarial red-teaming tailored to large language models.

Grid Dynamics supports full-cycle GenAI adoption, helping Fortune 100 clients with custom LLM development, prompt engineering and deployment. With strong enterprise relationships, Grid Dynamics competes on the AI transformation front. However, unlike Innodata, it doesn't offer a ready-to-deploy platform for model safety testing and benchmarking.

INOD shares have jumped 17.2% year to date, while the broader Zacks Computer & Technology sector returned 1.5% and the Zacks Computer - Services industry grew 2.2%. Innodata stock is trading at a premium, with a forward 12-month Price/Sales of 5.49X compared with the Computer - Services industry's 1.81X. INOD has a Value Score of F.

The Zacks Consensus Estimate for second-quarter 2025 earnings is pegged at 11 cents per share, unchanged over the past 30 days, implying a 50% decline from the previous quarter. The consensus mark for Innodata's fiscal 2025 earnings is pegged at 69 cents per share, which has also remained unchanged over the past 30 days; the figure marks a decline of 22.47% from fiscal 2024. Innodata currently carries a Zacks Rank #3 (Hold).

This article originally published on Zacks Investment Research.


Geeky Gadgets
16-05-2025
- Geeky Gadgets
The Only Data Cleaning Framework You Need: From Chaos to Clarity
Imagine this: you've just received a dataset for an urgent project. At first glance, it's a mess: duplicate entries, missing values, inconsistent formats, and columns that don't make sense. You know the clock is ticking, but diving in feels overwhelming. Sound familiar? Here's the truth: unclean data is the silent killer of good analysis. Even the most sophisticated algorithms or visualizations can't save you if the foundation, your data, is flawed. That's why mastering the art of data cleaning isn't just a nice-to-have skill; it's essential. And while the process can seem daunting, there's good news: a simple, structured framework can transform chaos into clarity. Enter the CLEAN framework, the only methodology you'll ever need to tackle data cleaning with confidence and precision.

Christine Jiang explains how the CLEAN framework simplifies the complexities of data preparation into five actionable steps. From identifying solvable issues to documenting your decisions, this approach ensures your datasets are not only accurate but also transparent and ready to deliver actionable insights. Along the way, you'll discover why data cleaning is an iterative process and how to balance perfection with practicality. Whether you're a seasoned data analyst or just starting out, this framework will empower you to approach messy datasets with a clear plan and purpose. Because in the world of data, the quality of your analysis is only as good as the quality of your preparation. So, how do you turn 'good enough' data into great decisions? Let's explore.

What Is the CLEAN Framework?

The CLEAN framework is a practical and systematic methodology designed to simplify the complexities of data preparation. Each step offers clear guidance to help you identify, resolve, and document data issues effectively; a short code sketch of the early steps follows the list. Below is a detailed breakdown of the five steps:

- Conceptualize the data: Begin by understanding the dataset's structure, key metrics, dimensions, and time grain. This foundational step ensures you have a clear grasp of what the data represents and how it aligns with your analytical objectives.
- Locate solvable issues: Identify common problems such as inconsistent formats, null values, duplicates, or nonsensical entries. Use tools like filters, pivot tables, and logical checks to systematically pinpoint these issues.
- Evaluate unsolvable issues: Not all problems can be resolved. Document missing data, outliers, or violations of business logic that cannot be fixed, and assess their potential impact on your analysis.
- Augment the data: Enhance your dataset by adding calculated metrics, new time grains (e.g., weeks or months), or additional dimensions like geographic regions. This step increases the dataset's analytical flexibility and depth.
- Note and document: Maintain a detailed log of your findings, resolutions, and any unresolved issues. This ensures transparency and serves as a valuable reference for future analysis.
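To ground the first three steps, here is a minimal pandas sketch. It is illustrative rather than prescriptive: the orders.csv file and the column names (order_id, order_date, ship_date, region, amount) are hypothetical stand-ins, and the checks are examples of the kinds of issues described above, not an official implementation of the framework.

    import pandas as pd

    # Preserve the raw data; all fixes go into a copy.
    raw = pd.read_csv("orders.csv", parse_dates=["order_date", "ship_date"])
    df = raw.copy()

    # Conceptualize: structure, key metrics, dimensions, and time grain.
    print(df.shape)
    print(df.dtypes)
    print(df.describe(include="all"))

    # Locate solvable issues: duplicates, nulls, inconsistent formats.
    dup_count = df.duplicated(subset="order_id").sum()
    null_counts = df.isna().sum()
    df["region_clean"] = df["region"].str.strip().str.title()  # new column; raw kept

    # A logical check for nonsensical entries: shipping before ordering.
    bad_dates = df[df["ship_date"] < df["order_date"]]

    # Evaluate unsolvable issues: flag and count, don't silently impute.
    unsolvable = df["amount"].isna().sum()
    print(f"{dup_count} duplicate ids, {len(bad_dates)} impossible date pairs, "
          f"{unsolvable} missing amounts left unresolved")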
Why Data Cleaning Is an Iterative Process

Data cleaning is rarely a one-time task. Instead, it is an iterative process that involves refining your dataset layer by layer. The focus should be on making the data suitable for analysis rather than striving for unattainable perfection. This iterative approach saves time and ensures that your efforts are aligned with the dataset's intended purpose. Each pass through the data allows you to uncover and address new issues, gradually improving its quality and usability.

How to Apply the CLEAN Framework

To effectively implement the CLEAN framework, follow these actionable steps:

- Perform sanity checks: Review data formats, spelling, and categorizations to ensure consistency and accuracy.
- Identify patterns or anomalies: Use filters, pivot tables, and visualizations to detect irregularities or inconsistencies in the data.
- Validate relationships: Conduct logical checks to confirm relationships between variables, such as making sure that order dates precede shipping dates.
- Preserve raw data: Avoid overwriting the original dataset. Instead, create new columns or tables for cleaned data to maintain the integrity of the raw data.
- Document decisions: Record every action you take, including unresolved issues, to maintain transparency and accountability throughout the process.

Dealing with Unsolvable Data Issues

Not all data problems have straightforward solutions. For example, missing values or anomalies may lack a reliable source of truth. When faced with such challenges, consider the following strategies:

- Document the issue: Clearly note the problem and its potential impact on your analysis to ensure transparency.
- Avoid unjustified imputation: Only fill in missing data if the method can be justified with strong business logic or external validation.
- Communicate limitations: Share unresolved issues with stakeholders to ensure they understand any constraints or limitations in the analysis.

Enhancing Your Dataset

Once your data is cleaned, consider augmenting it to unlock deeper insights and improve its analytical value. This can involve the following, illustrated in the sketch after this list:

- Adding time grains: Introduce new time intervals, such as weeks, quarters, or fiscal years, to enable trend analysis and time-based comparisons.
- Calculating metrics: Create new metrics, such as average order value, customer lifetime value, or time-to-ship, to provide more actionable insights.
- Integrating additional data: Enrich your dataset with external information, such as demographic data or regional sales figures, to support more nuanced and comprehensive analysis.
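To make the augmentation step concrete, here is a brief pandas sketch. It assumes the same hypothetical orders table as the earlier example (order_date, ship_date, amount, customer_id); the derived columns are illustrative choices, not a fixed recipe.

    import pandas as pd

    df = pd.read_csv("orders_cleaned.csv", parse_dates=["order_date", "ship_date"])

    # Add coarser time grains to enable trend analysis.
    df["order_week"] = df["order_date"].dt.to_period("W")
    df["order_month"] = df["order_date"].dt.to_period("M")
    df["order_quarter"] = df["order_date"].dt.to_period("Q")

    # Calculated metrics: time-to-ship per order, average order value per customer.
    df["days_to_ship"] = (df["ship_date"] - df["order_date"]).dt.days
    avg_order_value = df.groupby("customer_id")["amount"].mean()

    # The new grain immediately supports time-based comparisons.
    monthly_revenue = df.groupby("order_month")["amount"].sum()

Integrating external data, the third bullet above, would typically be a merge on a shared key such as a region or customer identifier.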
Best Practices for Professional Data Cleaning

To ensure a smooth and professional data cleaning process, adhere to these best practices:

- Preserve data lineage: Maintain a clear record of both the original and cleaned datasets to track changes and ensure reproducibility.
- Prioritize critical issues: Focus on resolving problems that have the greatest impact on your key metrics and dimensions.
- Emphasize transparency: Document every step of your process, including assumptions, limitations, and decisions, to build trust in your analysis and assist collaboration.

Key Takeaways for Data Analysts

Data cleaning is a foundational skill for any data analyst, and the CLEAN framework provides a structured approach to mastering this critical task. By following its five steps (conceptualizing, locating, evaluating, augmenting, and noting), you can systematically address data issues while maintaining transparency and accountability. Remember, the process is as much about thoughtful documentation and systematic problem-solving as it is about technical execution. With consistent practice, you can transform messy datasets into reliable tools for analysis, paving the way for impactful and data-driven insights.

Media Credit: Christine Jiang
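Finally, the documentation the framework keeps returning to does not need heavy tooling. The sketch below shows one lightweight convention for the note-and-document step and for preserving lineage; the file name, fields, and example entries are all hypothetical.

    import json
    from datetime import date

    # A minimal cleaning log: what was found, what was done, what remains open.
    # Entries are illustrative examples, not real findings.
    cleaning_log = [
        {"issue": "duplicate order_id rows", "action": "dropped exact duplicates",
         "resolved": True},
        {"issue": "ship_date before order_date", "action": "flagged, not altered",
         "resolved": False},
        {"issue": "missing amount values", "action": "left null; no source of truth",
         "resolved": False},
    ]

    # Write the log next to the cleaned dataset to preserve lineage.
    with open(f"cleaning_log_{date.today().isoformat()}.json", "w") as f:
        json.dump(cleaning_log, f, indent=2)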