Core Concepts
Major AI developers should provide legal and technical safe harbors to protect public interest safety research from account suspensions or legal reprisal.
Summary
1. Abstract:
- Independent evaluation and red teaming are crucial for identifying risks posed by generative AI systems.
- Prominent AI companies' terms of service, intended to deter model misuse, also disincentivize good faith safety evaluations.
- Proposal for major AI developers to provide legal and technical safe harbors for public interest safety research.
2. Introduction:
- Generative AI systems raise concerns about misuse, bias, hate speech, privacy, and more.
- Leading AI companies provide limited transparency into, and access to, their systems, hindering independent evaluation.
- Restrictive terms of service deter independent evaluation and put researchers at risk of account suspension.
3. Challenges to Independent AI Evaluation:
- AI companies' terms of service discourage community-led evaluations.
- Companies' enforcement processes are opaque, further limiting independent evaluation.
- Existing safe harbors protect security research but not other good faith research.
4. Safe Harbors:
- Proposal for a legal safe harbor to protect researchers conducting good faith research from legal action.
- Proposal for a technical safe harbor to protect good faith research from account suspensions.
- Recommendations for companies to delegate access authorization to trusted third parties.
5. Related Proposals:
- Prior calls for expanding independent access for AI evaluation and red teaming.
- Government recommendations for independent evaluation and red teaming of AI systems.
Key Points
Major AI developers should provide legal and technical safe harbors to protect public interest research.
AI companies' terms of service hinder independent evaluation and lead to account suspensions.
Companies should broaden participation by delegating research access authorization to trusted third parties.
Quotes
"We propose that major AI developers commit to providing a legal and technical safe harbor, indemnifying public interest safety research and protecting it from the threat of account suspensions or legal reprisal." - Authors
"The gaps in the policy architectures of leading AI companies force well-intentioned researchers to either wait for approval from unresponsive access programs, or risk violating company policy and potentially losing access to their accounts." - Authors