Technical facts reference library
Grab - and keep - your reader’s attention. Add these citations to your technical articles to make your message more convincing.
This is a collection of statistics that ercule maintains on behalf of its own SEO and technical writing content. It focuses mainly on the following areas:
- Data and data management
- Digital transformation
- IT security
- Cloud platform adoption, usage, and growth
- Artificial Intelligence (AI)
- Software Development Lifecycle (SDLC)
- Open-source software
- Software quality
This list isn’t meant to be comprehensive. It’s merely a selection of the most relevant statistics for our clients, whose businesses are spread across the categories listed above.
How to use this resource
Browse the list or use your browser’s search function to find a relevant fact to cite in your technical content or technical marketing materials.
Every report is linked to the original source of the statistic instead of an intermediary page. The Report metadata section at the bottom contains information about the reports themselves, including:
- The methodology used to collect or calculate data
- How often the report is updated
- Any other pertinent facts that might be relevant. For example, whether a report breaks out its answers further by industry, region, etc.
We endeavor to keep these statistics up to date. If you notice a discrepancy or have a link to a later version of a cited report, please let us know.
Guidelines for using statistics to engage readers and build trust
- Use a couple of key statistics relevant to your article topic in the introduction and throughout the article to keep readers engaged. Cite relevant statistics related to the problem your product solves. For example, if your product is data-related, cite statistics around the growing volume of data, data quality issues, compliance penalty costs, etc.
- Reference authority links, i.e., the original source of the statistic. Referencing authority links is a tried-and-true SEO tactic that boosts search engine placement. It also builds trust with readers, as they can see where you obtained your information.
- Don’t cite statistics without references. In particular, don’t cite Web sites that publish statistics without links to the original sources.
- In particular, avoid citing Statista as a “source.” Statista is a statistics collation site and its information is based on other sources, which it hides from non-subscribers. Only reference one of their sources if you have a Statista subscription and can see the underlying source, or can find the primary source elsewhere.
- Always check the sample size and regionality of the data to ensure the information is relevant to your target audience. How many people were surveyed to produce a statistic? Were they all sampled from a single geographic region? Do the numbers look different if you break them down by region instead?
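As a rough way to judge whether a sample size is large enough, the sketch below estimates the 95% margin of error for a surveyed percentage. It assumes a simple random sample, which many industry surveys are not (respondents are often self-selected), so treat the result as a best case rather than a guarantee. The function name `margin_of_error` and its parameters are illustrative, and the sample sizes are examples of the respondent counts listed in the report metadata.

```python
import math


def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a surveyed proportion.

    Assumes a simple random sample (an assumption, not something the
    cited reports guarantee). z=1.96 corresponds to 95% confidence;
    proportion=0.5 gives the widest, most conservative interval.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Example respondent counts similar to those in the report metadata.
for n in (200, 782, 1000, 90000):
    moe = margin_of_error(n)
    print(f"n={n}: about +/- {moe * 100:.1f} percentage points")
```

A statistic from a 200-person survey carries roughly twice the uncertainty of one from a 1,000-person survey, which is worth keeping in mind before quoting it to one decimal place.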
Data and data management
Volume of data
IDC data report (November 2018)
- Amount of data by 2025: 175 zettabytes
- Global datasphere to double between 2022 and 2026
- Data volumes growing an average 63 percent per month or more.
- Companies drawing from an average of 400 or more sources, with 20 percent or more drawing from 1,000 or more sources.
- One trend driving the growth of big data is the rapid growth of social media.
- 87.8% of organizations surveyed reported increasing their investment in data in 2022
- However, only 23.9% of companies consider themselves “data-driven”; only 20.6% say they have created a data culture in their organization.
Cost of data
In 1985, storing 1TB of data would have cost you $31.39M. In 2022, it cost an average of $14.30 to store 1TB on magnetic media ($49.50 for solid-state storage).
Data maintenance
- 42% of respondents indicated at least half of their data is dark data.
Veritas - Dark data (August 2023) [NOTE: not well-sourced]
- Over 50% of a company’s data is “dark” - i.e., not used and not maintained.
- Over 90% of IT leaders surveyed said it was challenging to transform data for analytics.
Data governance and compliance
Data governance management and investment
- 82.6% of companies have a Chief Data Officer - explosive growth from 2012, when only 12% did.
Compliance rates
- By 2024, 75% of the world’s population will have its personal data covered under privacy regulations
- Up from 10% in 2020 and an estimated 65% by EoY 2023
- Around 70% of all businesses say that compliance and governance are top concerns they have with the cloud. 71% of Small and Medium Businesses (SMBs) say that Compliance is a concern with cloud systems.
- 30% of respondents say legal & regulatory compliance are holding back their plans for software cloud usage.
- 90% of organizations asked believe that DataOps is improving their data quality
Value of data
- 46% of respondents said identifying the quality of source data is a major impediment in effectively using it.
- Poor quality data (bad data) costs organizations $12.9M annually
Costs of data governance and compliance
- It costs a company an average of $5.5 million to achieve compliance.
- The cost of non-compliance averages $15 million.
- Put together, getting compliant costs an average of $9.5 million less than non-compliance in the long run.
- 130 financial sector CISOs were set to expand their spending on compliance by 20 to 30% between 2021 and 2022.
Regulatory fines for mismanaged data
Meta GDPR fine (May 22nd, 2023)
- Meta was fined 1.2 billion Euros by the Irish Data Protection Commission for transferring EU data to the US for storage and processing. It was also ordered to stop processing all such data within six months.
- Amazon was fined 746 million Euros by the Luxembourg National Commission for Data Protection (CNDP). The ruling came after 10,000 people said that Amazon did not obtain proper consent for the processing of certain data in the EU.
- The Danish Data Supervisory Authority fined the bank 1.3 million Euros (DKK 10 million) for not deleting consumer personal data after it no longer had a legitimate business reason to process it.
Data breaches
- Paige Thompson of Seattle broke into a Capital One server and stole 140,000 US social security numbers, 1 million Canadian Social Insurance numbers, and 80,000 bank account numbers. She was a former employee of Amazon Web Services (AWS) who used her insider knowledge to exploit a misconfigured firewall. This incident shows the risk that insiders can pose to a company, even after termination.
- The global average cost of a data breach in 2023 was $4.45 million.
- 82% of data breaches (in 2022?) involved data stored in the cloud.
- “…98% of the companies surveyed had experienced at least one cloud data breach in the past 18 months compared to 79% last year. Meanwhile, 67% reported three or more such breaches, and 63% said they had sensitive data exposed.”
Digital transformation projects
McKinsey - digital transformation failures (April 11th, 2023)
- Only 30 percent of banks successfully implemented their digital transformation projects.
- 70 percent of all digital transformation projects went over budget.
McKinsey - Digital transformation stats (December 7th, 2021)
- 70 percent of all digital transformation projects fail.
- BUT 70 percent of digital transformation projects SUCCEED when people feel a sense of ownership over the process
- 80% of IT leaders say integration issues hinder their digital transformation projects.
Finding information
- Employees spend 3.6 hours/day looking for information. IT employees spend 4.2 hours.
- 89.6% of workers say they have to search 1 to 6 separate sources to find information. For 52% of tech/IT workers, it’s between 4 and 6.
- 44% of workers say what slows them down the most is that information is stored across multiple applications. 31% say that outdated company Intranet information slows them down the most.
- 45% of respondents say information they find internally is irrelevant.
Artificial intelligence
- 74% of executives believe the benefits of AI will outweigh the associated concerns
- 70% think AI will bolster productivity for knowledge workers
- 71% of executives think AI will make customer experience more active and engaging
- Generative AI could add up to $4.4 trillion annually to the global economy
- Generative AI could boost labor productivity growth by 0.1 to 0.6 percent annually through 2040
- Generative AI could automate activities that consume 60 to 70 percent of workers’ time today
- Capital One uses predictive analytics to catch potential regulatory infringements before they happen - a use of AI combined with active data governance.
- Organizations that employ security AI and related automation can save an additional $1.76 million over those that don’t.
- 71% of IT leaders say they are leveraging Machine Learning (ML) and Artificial Intelligence (AI) technologies.
- GPT-3 reportedly cost an estimated $4 million to train.
- An analyst told The Information that ChatGPT may cost up to $700,000 a day to run.
- 77% of developers feel favorably toward AI tools.
- 42% of developers trust the output from AI tools.
- A majority see increased productivity as the greatest benefit of AI tools in the development process.
IT Security
- Cybercrime losses rose 64% between 2020 and 2021.
- A hacker held Colonial Pipeline to ransom for $2 million using stolen credentials and a VPN connection. The case shows the importance of using Multi-Factor Authentication (MFA).
- Remote Desktop Protocol (RDP) was used in about 30% of all successful attacks. In 41% of cases, it was used mostly for internal, lateral movement around the network.
- RDP used in Equinix ransomware breach.
- 59% of respondents say the biggest threat to cloud security is misconfiguration of the cloud platform or improper setup. The next biggest threats are insecure APIs and exfiltration of sensitive data (both 51%) and unauthorized access (49%).
- 60% of organizations surveyed planned to increase their budgets for cloud security.
Internal security threats
- Statistics show a spike in data downloads from one-third of departing employees - a sign the employees were taking data with them as they left.
- 97% of cloud apps used - primarily sharing and collaboration apps - are unmonitored Shadow IT.
Cloud Platforms
- 87% of organizations are embracing a multi-cloud strategy.
- The most used public cloud service is the data warehouse (51%).
- 82% of all organizations surveyed said their top challenge was cloud spend. The second challenge was security (79%), followed by lack of resources/expertise (78%). Only 47% of SMBs (Small to Medium Businesses) said expertise was a problem; 71% said it was compliance.
- Half of organizations say their cloud spend is too high - but only 3 out of 10 know what they’re actually spending money on.
- 41% of respondents say cloud costs have disrupted work by one week or more; 11% say it’s disrupted an entire sprint.
- 58% of respondents plan to run over 50% of their workloads in the cloud in 2023 and beyond.
- 69% of respondents are multi-cloud - i.e., they use two or more cloud providers.
- 53% of respondents say the greatest benefit of the cloud is more flexible capacity/scalability.
Software development
Software marketplace
“The global Software as a Service (SaaS) market is projected to grow from $273.55 billion in 2023 to $908.21 billion by 2030, at a CAGR of 18.7%.”
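Before citing a growth projection like the one above, it can help to check that the quoted growth rate matches the start and end figures. The sketch below recomputes the compound annual growth rate (CAGR) from the two market sizes. The function name `cagr` is illustrative, and the seven-year span (2023 to 2030) is an assumption about how the report counts periods.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, end value, and span."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the quoted SaaS projection, in billions of US dollars.
# Assumption: 2023 to 2030 is treated as seven compounding periods.
rate = cagr(273.55, 908.21, 2030 - 2023)
print(f"Implied CAGR: {rate:.1%}")  # prints "Implied CAGR: 18.7%"
```

Here the implied rate agrees with the quoted 18.7%. When the numbers do not reconcile, check the report's base year before using the statistic.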
Python remained #1, but in terms of jobs, SQL is in greatest demand.
Languages and frameworks
- However, top-paying languages are Zig, Erlang, F#, Ruby, and Clojure
- Docker is the top-used “other” tool
Low-code and no-code software development
- The total low-code development market will grow to $26.9 billion in 2023, a 19.6% increase from 2022, predicts Gartner.
- Investment in hyperautomation to increase to $720 billion.
Software quality
- Accumulated software Technical Debt (TD) is now $1.2 trillion.
- Software developers on average spend 33% of their time every week addressing Technical Debt.
- One hour of downtime can cost anywhere from $100,000 to between $1 million and $5 million.
Software Development Lifecycle (SDLC)
NIST - Cost of fixing bugs in production (January 5th, 2023) (NOT ORIGINAL REPORT)
- It can be 30x to 100x more expensive to fix a bug in the deployment/maintenance phase of the SDLC than in earlier phases
- It can be 100x more expensive to fix a bug in maintenance than earlier in the SDLC
Software security
Argon - software supply chain attacks in 2021 (2021) (NOTE - find original report)
- Software supply chain attacks increased by 300% in 2021
- Between 2020 and 2021, open-source parts of the software supply chain saw a 600% increase in attacks.
Open source software
- In 2021, the number of organizations using open-source software rose 77%.
- 89% of IT leaders see enterprise open source software as being as secure as or more secure than proprietary software.
- Enterprise open-source software is expected to grow from 29% to 34% in two years.
- 36% of IT leaders say that concerns over support of open-source software limit its use.
- 32% of industry leaders say that the benefits of open-source software include both better security and higher-quality software.
- 68% of IT leaders say they are leveraging containers and containerization technology
- 70% of IT leaders say they work in an organization that uses Kubernetes.
- 43% of IT leaders say that they lack the necessary skills to adopt containers.
- 39% of IT leaders say they don’t have the necessary staff to adopt container technology.
Report metadata
How often updated: There appears to be a new 2022 report that supersedes the one above. Access costs $4,500. You may be able to get some of these statistics using a Statista subscription. Many sites quoting the zettabytes prediction figure appear to still reference the 2018 statistic.
Who they asked: 750 cloud decision-makers
Who they asked: 1,296 interviews with IT leaders, most in Europe and the United States.
Other attributes of the report: Answers are broken out by region of the world so you can see differences between regions.
Who they asked: Methodology broken down per question within the report, including how they calculated technical debt costs.
How often updated: Every two years. Previous reports were in 2018 and 2020.
Who they asked: “The survey polled more than 200 IT, data science, and data engineering professionals at North American organizations with at least 1,000 employees. Respondents work across several industries, including technology, finance, retail, and healthcare.”
How often updated: Yearly
Who they asked: 90,000 developers who use StackOverflow
Who they asked: 1,000 engineering and finance professionals
How often updated: Yearly?
No clear sourcing information found.
How often updated: Yearly
Who they asked: 782 cybersecurity professionals
Other resources
NOTE: Facts in this resource are not aligned well with their sources. In addition, some sources are dated. Mine for data as we have here but use cautiously.
Not an official resource but a useful world overview with a visual map and a ranking of privacy law strictness.
Official resources for key standards and regulations
General Data Protection Regulation (GDPR) - https://gdpr.eu/
California Consumer Privacy Act (CCPA) - https://oag.ca.gov/privacy/ccpa
PCI Payment Security Standards - https://www.pcisecuritystandards.org/
PCI DSS overview: https://listings.pcisecuritystandards.org/documents/PCI_DSS-QRG-v3_2_1.pdf
Health Insurance Portability & Accountability Act (HIPAA)
Home page: https://www.hhs.gov/hipaa/index.html