Taking full advantage of generative AI benefits: balancing innovation and data protection
As the UK positions itself as a global leader in AI, it is crucial to create a regulatory environment that fosters innovation and public trust.
The potential benefits offered by Generative AI (GenAI) are immense. By supporting innovation and enhancing productivity, it has the potential to contribute up to £120 billion annually to the UK economy. Beyond its economic impact, GenAI can also drive societal progress, for example by improving healthcare outcomes and enhancing education and learning experiences.
However, to unlock these benefits, and to help the UK position itself as a global leader in AI, clear, practical regulation is essential: regulation that supports innovation while safeguarding individual rights through robust data protection measures. Striking this balance is key to fostering public trust, protecting rights, and ensuring the safe deployment of GenAI.
The ICO’s consultation series
The Information Commissioner’s Office (ICO) recently conducted a five-part consultation series on how UK data protection law applies to GenAI, which closed on 18 September 2024. This series marks an important step toward greater regulatory clarity in this rapidly evolving area.
While the final guidance is still in development, the consultation series has laid important groundwork for organisations developing and deploying AI systems. Once finalised, it is expected to provide valuable insights to help organisations take full advantage of GenAI’s potential while ensuring robust data protection compliance.
techUK has actively engaged throughout this consultation process, facilitating a roundtable discussion between our members and the ICO, and providing detailed responses to each chapter, emphasising the need for a balanced approach that fosters innovation while safeguarding individuals' rights.
The consultation series explored five key areas of data protection in GenAI, with the ICO’s analysis setting out the following:
- The first chapter tackled a key question: can companies legally use data scraped from the web to train AI models? The ICO suggested this could be permitted under the 'legitimate interests' lawful basis in data protection law, as long as appropriate safeguards are in place. Companies would need to clearly justify why they need the data, prove web scraping is necessary, and demonstrate that their proposed use appropriately balances organisational and individual interests.
- The second chapter focused on purpose limitation, advising organisations to clearly define and communicate their reasons for using personal data at each stage of the AI development process, from collecting training data to deploying AI applications. This helps ensure data is only used as intended and as people would reasonably expect.
- The third chapter addressed the accuracy principle in data protection law, distinguishing between the levels of accuracy needed for different uses. Some applications, such as creative tasks, may not require the model to produce 100% factually accurate information. However, accuracy becomes crucial when the AI's outputs can affect people directly, such as in decision-making or when generating factual information, where errors could lead to misinformation or reputational harm. Developers and deployers must carefully consider how the quality of the training data may affect outputs and ensure that any limitations in accuracy are clearly communicated to users.
- The fourth chapter set out how GenAI developers and deployers can enable individuals to exercise their data protection rights, such as accessing, correcting, or requesting deletion of their data, throughout the AI lifecycle, from training data collection to model deployment, including when personal data is collected indirectly, for example through web scraping.
- The final chapter tackled how to allocate responsibility for data protection across the AI supply chain. With multiple organisations often involved in developing and deploying AI models, the ICO emphasised the need to clearly define each organisation's responsibilities. The guidance included practical examples to help organisations determine their roles and obligations.
Key themes from techUK’s response
Throughout our engagement, we have emphasised the need for pragmatic regulation that enables innovation while ensuring effective safeguards, including:
Regulatory flexibility: We emphasise the need for a pragmatic and adaptive regulatory framework that can keep pace with rapid AI development, the variety of potential applications, and the iterative nature of model development. Overly rigid interpretations of data protection principles could stifle innovation and prevent the realisation of AI's full potential. We therefore particularly welcome the ICO's recognition of context-specific requirements – such as different accuracy standards for creative versus decision-making applications – which demonstrates the kind of flexible approach needed.
Fostering transparency: Transparency is key to maintaining public trust in AI, and we agree with the need for clear communication from companies about how they are using personal data, including what types of data are being collected and for what purposes. However, organisations need flexibility in how they communicate about their AI systems, particularly during development stages where detailed disclosure could reveal sensitive business information. In such instances, a layered approach could be considered: companies could, for example, give a more generalised explanation of data usage purposes that offers meaningful information to individuals while avoiding the disclosure of sensitive details that competitors could exploit.
Supporting organisations of all sizes: The regulatory framework should be accessible and practicable for both large corporations and SMEs, to promote inclusive innovation in the sector, and drive economic growth. techUK’s polling for the Seven Tech Priorities report highlights that larger businesses often see more opportunities in AI and have a higher level of readiness compared to smaller businesses. Therefore, easily interpretable regulation will be vital to ensuring that SMEs are not left behind, as larger companies are generally better equipped to meet compliance requirements. This includes clear guidance, scalable compliance measures, and appropriate support mechanisms to ensure organisations of all sizes can participate in and benefit from AI innovation while meeting their data protection obligations.
Legitimate interests as a legal basis: We welcome the ICO's recognition of legitimate interests as a potential lawful basis for GenAI development, provided organisations can demonstrate the necessity of data processing and implement appropriate safeguards. This provides crucial legal certainty while ensuring proper protections.
The value of open source: Open-source models represent one important approach to advancing AI innovation, enabling a broad range of organisations to participate in AI development. For example, recent research from Harvard University and the University of Toronto suggests there could be significant societal value from open-source infrastructure, estimated at $8.8 trillion. This approach can support transparency, as developers can examine model components directly, potentially helping identify and address security considerations. It may also help expand the UK's AI talent pool and support environmental improvements through shared innovations in computational efficiency. While different business models each bring their own advantages to the AI ecosystem, open source can play a valuable role alongside other development approaches in fostering innovation.
Employing technical solutions: We note that a number of technical solutions can help protect individual rights while enabling AI innovation. Privacy-enhancing technologies and robust input/output filters offer practical ways of managing personal data, while machine unlearning capabilities and data provenance tools can help address individual rights and maintain appropriate accuracy levels. We set out more information on each of these below.
- Privacy-enhancing technologies (PETs): Various privacy-enhancing techniques can effectively anonymise or pseudonymise personal data, thereby mitigating privacy concerns. These include synthetic data, which creates artificial datasets mirroring real-world patterns without personal information, and federated learning, which trains models across multiple local devices without directly sharing the underlying data: only the model’s learnings, reflecting the insights gained from local data, are shared. Other techniques include differential privacy, which adds controlled “noise” to training data, significantly reducing the probability that individual information can be identified, and input/output filters, which detect and block sensitive information from entering or leaving the system (minimal sketches of both follow this list).
- Data provenance & watermarking: Data provenance tools create an audit trail showing how AI-generated content was created and modified, while watermarking can be used to clearly mark content as AI-generated. These technologies add valuable transparency that helps build trust and ensure responsible development.
- Retrieval-Augmented Generation (RAG): RAG can be used to connect models to external verified sources and databases (such as a company's knowledge base or an external database) during use. This gives models access to up-to-date, relevant information tailored to the organisation's needs, improving accuracy while also reducing privacy risks by limiting the amount of sensitive data included in the model's training (see the sketch after this list).
- Machine unlearning: Machine unlearning could present an innovative solution to the challenges associated with data removal in GenAI models. Traditionally, removing problematic content required rebuilding the entire model: manually excluding the data and retraining from scratch. This technique would allow models to selectively forget specific information or data influences without retraining the entire model. Research into these tools is ongoing, but they could be used to facilitate compliance with data protection requirements such as the right to be forgotten under the GDPR.
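To make the differential privacy idea concrete, here is a minimal Python sketch of the Laplace mechanism, the textbook way of adding calibrated noise to a released statistic. It is illustrative only: the function name and parameter values are our own, and production systems use audited libraries and considerably more involved methods when applying noise during model training.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release a statistic with epsilon-differential privacy.

    The noise scale grows with the query's sensitivity (how much one
    person's record can change the result) and shrinks as the privacy
    budget epsilon grows, trading accuracy against privacy.
    """
    return true_value + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# Counting queries have sensitivity 1: one individual changes a count by at most 1.
true_count = 1024
print(laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5))
```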
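Input/output filters can start as simply as pattern matching over prompts and completions. The sketch below uses a couple of hypothetical patterns of our own choosing; real deployments rely on much broader, tested PII detectors rather than a handful of regular expressions.

```python
import re

# Hypothetical patterns for illustration only; production filters use
# broader, tested detectors (e.g. trained named-entity recognisers).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ni_number": re.compile(r"\b[A-Z]{2}\d{6}[A-Z]\b"),  # UK National Insurance format
}

def redact(text: str) -> str:
    """Replace matches before text enters (prompt) or leaves (output) the model."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text

print(redact("Contact jane.doe@example.com, NI number QQ123456C."))
# -> Contact [REDACTED EMAIL], NI number [REDACTED NI_NUMBER].
```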
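Finally, a minimal sketch of the RAG pattern. The documents, function names, and prompt format are illustrative assumptions, and toy word-overlap scoring stands in for the vector search a real retrieval system would use.

```python
import re

# Toy "verified source": in practice this would be a curated knowledge base.
documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm UK time.",
]

def tokens(text: str) -> set[str]:
    """Lower-cased word set, used for toy overlap scoring."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by word overlap; real systems rank with embeddings."""
    q = tokens(query)
    return sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Ground the model in retrieved context rather than memorised training data."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# A real system would send this prompt to a generative model.
print(build_prompt("When can I get a refund?"))
```

Because the model answers from context retrieved at query time, up-to-date or sensitive information need not be baked into its training data, which is the privacy benefit described above.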
Looking ahead
As the UK positions itself as a global leader in AI development, it is crucial to create a regulatory environment that fosters both innovation and public trust. The ICO’s consultations have laid important groundwork for achieving this balance. techUK looks forward to continued collaboration with industry stakeholders, regulators, and policymakers to ensure that GenAI drives growth and innovation while maintaining robust data protection standards.
Original article link: https://www.techuk.org/resource/taking-full-advantage-of-generative-ai-benefits-balancing-innovation-and-data-protection.html