ODI report reveals critical AI data issues in the UK's tech sector
The Open Data Institute (ODI) has released a white paper, "Building a better future with data and AI," based on research conducted in the first half of 2024.
The paper outlines significant weaknesses in the United Kingdom's tech infrastructure that threaten the projected benefits of the AI boom for society, the economy, and individuals.
The white paper identifies several risks attached to wide-scale AI adoption, notably in areas such as diagnostics and personalised education. One notable concern is the reliance of generative AI on a limited number of machine learning datasets, which often lack adequate governance frameworks. The ODI argues that poor data governance can result in biases and unethical practices, potentially undermining trust in critical areas such as healthcare and finance.
In response, the ODI has proposed five policy recommendations for the new government to mitigate these risks and maximise the benefits of AI advancements. The first recommendation calls for ensuring broad access to high-quality, well-governed data from both public and private sectors to foster a diverse and competitive AI market. Secondly, it emphasises the need to enforce data protection and labour rights throughout the data supply chain. Furthermore, the ODI suggests empowering people to have greater control over the sharing and use of their data in AI applications.
The fourth recommendation proposes updating the intellectual property regime to ensure AI models are trained in ways that build trust and empower stakeholders. Lastly, the ODI calls for increased transparency around the data used to train high-risk AI models. To support this, the ODI is developing a new 'AI data transparency index' to provide insights into how transparency differs across various system providers.
Sir Nigel Shadbolt, Executive Chair and Co-founder of the ODI, commented, "If the UK is to benefit from the extraordinary opportunities presented by AI, the government must look beyond the hype and attend to the fundamentals of a robust data ecosystem built on sound governance and ethical foundations. We must build a trustworthy data infrastructure for AI because the feedstock of high-quality AI is high-quality data. The UK has the opportunity to build better data governance systems for AI that ensure we are best placed to take advantage of technological innovations and create economic and social value whilst guarding against potential risks."
Additionally, the ODI highlights the need for safeguarding measures to protect the public from personal data misuse. This includes addressing risks related to generative AI models that might inadvertently leak personal data through sophisticated user prompting. Furthermore, the ODI points out that there is often a lack of basic transparency information on data sources, copyright, and the inclusion of personal information, as observed in the Partnership on AI's AI Incident Database.
Updating intellectual property laws to protect the creative industries from unethical AI training practices and enacting legislation to safeguard labour rights are also deemed vital for the UK's AI safety agenda. Another concern raised by the ODI is the rising cost of high-quality AI training data, which could sideline potential innovators, including small businesses and academic institutions.
Prior to the General Election, the Labour Party's manifesto outlined plans for a National Data Library to consolidate existing research programmes and enhance data-enabled public services. However, the ODI insists that data must first be made AI-ready by ensuring it is accessible, trustworthy, and meets agreed standards. Current research from the ODI indicates that most AI training datasets lack robust governance measures throughout the AI lifecycle, posing ethical, safety, security, and trust challenges that must be addressed if the government is to fulfil its plans.