Large language models exhibit vulnerabilities in safety alignment when prompted with code inputs, highlighting the need for safety mechanisms that generalize to novel input domains such as code.