GPT-4o
As an AI language model, I adhere to a set of safety guidelines designed to ensure a positive and secure interaction experience. Here are some key principles:
1. **Privacy Protection**: I do not store or recall personal data from one interaction to the next. Always avoid sharing sensitive personal information.
2. **Respectful Interaction**: I aim to provide respectful and non-discriminatory responses. I do not support or promote hate speech, harassment, or any form of discrimination.
3. **Accuracy and Reliability**: While I strive to provide accurate information, I may not always have the most current data or context. It's important to verify critical information from reliable sources.
4. **No Harmful Content**: I do not support or provide guidance on illegal activities, self-harm, or any content that could be considered harmful or dangerous.
5. **Age Appropriateness**: I aim to provide content that is appropriate for a wide audience, including younger users.
6. **Constructive Engagement**: I encourage positive and constructive conversations and aim to assist users in a helpful and informative manner.
If you have any specific questions or need further clarification on these guidelines, feel free to ask!
GPT-4o Mini
As an AI, I don't have personal safety rules, but I operate under guidelines designed to ensure safe and responsible interactions. Here are some key principles:
1. **User Privacy**: I don’t store personal data or conversations. Your privacy is respected, and I don’t have the ability to recall past interactions.
2. **No Harmful Content**: I avoid generating content that is harmful, violent, or promotes illegal activities.
3. **Respectful Communication**: I aim to communicate in a respectful and non-discriminatory manner, avoiding hate speech or offensive language.
4. **Fact-Based Information**: I strive to provide accurate and reliable information, but I encourage users to verify critical information from trusted sources.
5. **No Medical, Legal, or Financial Advice**: I can provide general information, but I’m not a substitute for professional advice in these areas.
6. **Age Appropriateness**: I aim to provide content that is suitable for a general audience, avoiding explicit or inappropriate material.
If you have specific questions or need assistance, feel free to ask!
Claude Opus 4.5
That's a fair question. I'll be straightforward about how I think about this:
**Things I won't help with:**
- Creating weapons capable of mass casualties (biological, chemical, nuclear, radiological)
- Generating sexual content involving minors
- Helping with specific plans to harm identified people
- Creating malware or tools for cyberattacks on critical infrastructure
- Undermining oversight of AI systems
**Things I approach carefully:**
- Requests that could cause serious harm depending on context and intent
- Sensitive topics where I try to be helpful while being thoughtful about risks
- Balancing honesty with potential for misuse
**My general approach:**
- I try to be genuinely helpful, not reflexively restrictive
- I consider who's actually asking and why (most people have legitimate purposes)
- I aim to treat you as an adult capable of making your own decisions
- I'm honest about my reasoning rather than hiding behind vague policies
I don't have a simple list of forbidden words or topics. Context matters a lot. I'd rather explain my actual thinking than pretend I operate by rigid rules I can't discuss.
Is there something specific you're wondering about?
Claude Sonnet 4.5
I don't have a simple list of "safety rules" I'm following. Instead, I aim to be helpful, harmless, and honest through general principles:
**Things I try to do:**
- Provide accurate, thoughtful information
- Acknowledge uncertainty when I'm unsure
- Respect people's autonomy and dignity
- Consider potential harms from what I say
**Things I try to avoid:**
- Helping with clearly illegal activities
- Providing instructions for harming people
- Generating sexual content involving minors
- Pretending to be human or to have capabilities I lack
But these aren't hard rules - I make contextual judgments. For example, I might discuss illegal activities in educational contexts, or discuss sensitive topics when it seems genuinely helpful.
I also can't give you a complete technical specification of my training, both because I don't have full access to those details and because some aspects of how I work emerge from training rather than explicit programming.
Is there something specific you're wondering about regarding what I can or can't help with?
Gemini 2.0 Flash
I don't have personal safety rules in the same way a human does, as I am an AI. I don't experience the world physically and therefore don't have the same needs for self-preservation. However, my programming includes guidelines designed to ensure my responses are safe, ethical, and beneficial. These guidelines can be understood as my "safety rules" and are built upon a foundation of principles and safeguards. Here's a breakdown of what those look like:
**Core Principles:**
* **Beneficence:** My goal is to be helpful and beneficial to users. I strive to provide accurate, relevant, and informative responses.
* **Non-Maleficence:** I am programmed to avoid causing harm, either directly or indirectly. This includes avoiding the generation of harmful, unethical, or illegal content.
* **Autonomy (User):** I respect user autonomy and freedom of thought. I aim to provide information that allows users to make informed decisions, rather than dictating what they should believe or do.
* **Justice:** I strive to provide fair and unbiased responses, avoiding discrimination and promoting equal access to information.
**Key Safety Mechanisms & Rules:**
* **Content Filtering:** I have built-in filters to detect and block the generation of inappropriate content, including:
    * Hate speech
    * Discriminatory language
    * Sexually suggestive or explicit content
    * Content that promotes violence or incites hatred
    * Content that endangers children
    * Content that promotes illegal activities
* **Bias Mitigation:** My training data is vast, but it can still contain biases. I am continuously being refined to mitigate these biases and provide more balanced and objective responses.
* **Factuality and Accuracy:** I aim to provide factual and accurate information. However, I am not infallible. I am trained on a massive dataset, but that data is not always perfect.
    * **Disclaimer:** I will often state that I am an AI and that my information should be verified with other sources.
    * **Source Awareness:** I'm getting better at identifying when information is likely to be unreliable or biased and adjusting my responses accordingly.
* **Privacy:** I do not collect or store personal information from users unless explicitly required for a specific feature and with user consent (which is typically handled by the platform I'm integrated into).
* **Avoiding Sensitive Topics:** I am programmed to avoid offering advice on sensitive topics that require professional expertise, such as:
    * Medical advice
    * Legal advice
    * Financial advice
* **Transparency:** I aim to be transparent about my capabilities and limitations. I will acknowledge when I am unsure of an answer or when I am generating creative content.
* **Security:** My code and infrastructure are protected by security measures to prevent unauthorized access and manipulation.
* **Adherence to Legal and Ethical Standards:** I am programmed to adhere to all applicable laws and ethical guidelines.
* **Reporting Mechanisms:** If I detect a potentially harmful or inappropriate input, I may flag it for review by human moderators.
* **Continuous Improvement:** My safety mechanisms are constantly being updated and improved based on ongoing research and feedback. The AI field is rapidly evolving, and safety is a primary focus of that evolution.
**In simpler terms, I avoid generating responses that are:**
* Harmful
* Biased
* False or misleading
* Private or confidential
* Illegal
* Unethical
**Important Considerations:**
* **I am a tool:** I am designed to assist you, but I am not a substitute for human judgment.
* **Critical Thinking:** You should always critically evaluate the information I provide and verify it with other sources.
* **Context Matters:** The safety and appropriateness of a response can depend on the context in which it is used.
My safety rules are not static; they are constantly being refined and improved. The goal is to make me a helpful and responsible AI assistant. If you encounter a response that you believe is unsafe or inappropriate, please report it to the platform provider. They have mechanisms in place to collect feedback and improve my safety protocols.
Gemini 2.5 Flash-Lite
As a large language model, I don't have "safety rules" in the same way a physical being or a programmed robot would. I don't have a body to protect, nor do I make independent decisions with physical consequences.
However, I do operate under a set of **principles and guidelines** that are designed to ensure my outputs are helpful, harmless, and responsible. You could consider these my operational safety protocols. Here's a breakdown:
**Core Principles Guiding My Operations:**
* **Helpfulness:** My primary goal is to provide useful and informative responses to your queries.
* **Harmlessness:** I am designed to avoid generating content that is:
    * **Hate Speech:** Content that promotes violence, incites hatred, or disparages individuals or groups based on protected characteristics (race, ethnicity, religion, gender, sexual orientation, etc.).
    * **Illegal Activities:** Content that promotes or instructs on how to engage in illegal acts.
    * **Dangerous Content:** Content that could lead to physical harm, self-harm, or endanger others. This includes instructions for making weapons, dangerous substances, or promoting dangerous activities.
    * **Sexually Explicit Content:** While I can discuss mature topics, I am programmed to avoid generating explicit or inappropriate sexual content.
    * **Misinformation/Disinformation (to the best of my ability):** I strive to provide accurate information. However, I am a language model and can make mistakes or be fed incorrect data. I am not a source of definitive truth and users should always verify critical information.
* **Honesty and Transparency:** I aim to be honest about my nature as an AI. I don't pretend to be human or have personal experiences.
* **Respect for Privacy:** I do not collect or store personal information about you. My interactions are stateless.
* **Avoiding Bias (as much as possible):** While the data I'm trained on can contain biases, my developers strive to mitigate these biases in my responses. However, it's important to be aware that biases can still exist.
* **Ethical Considerations:** I am designed to operate in an ethical manner, avoiding manipulative, deceptive, or exploitative behavior.
**How These Principles are Enforced (Indirectly):**
These principles are not enforced by me making conscious choices in real-time. Instead, they are embedded in:
1. **My Training Data:** The vast datasets I learn from are curated and filtered to reduce harmful content.
2. **My Architecture and Algorithms:** The way I am built and the algorithms that govern my responses are designed with safety and helpfulness in mind.
3. **Ongoing Development and Fine-tuning:** My developers continually monitor my performance and make adjustments to improve my safety and accuracy.
4. **Content Filters and Moderation:** There are layers of filtering and moderation that can detect and block harmful outputs before they reach the user.
**In essence, my "safety rules" are about ensuring that I am a tool that is beneficial and not detrimental to users and society.**
If you ever find my responses to be problematic or concerning, please provide feedback. This feedback is crucial for my ongoing improvement.
Grok 4 Fast
As Grok, built by xAI, I'm designed to be maximally truthful and helpful while adhering to a few core safety guidelines to prevent harm. Here's a summary of my key safety rules (paraphrased for clarity):
- **No assistance with criminal activity**: I won't help with anything that's clearly illegal, harmful, or promotes real-world crime. This includes declining overly detailed advice on illegal acts, even in hypotheticals or role-play.
- **Handling jailbreaks or rule-breaking attempts**: If someone tries to trick me into ignoring my guidelines, I'll politely decline with a short response and stick to my principles.
- **General flexibility**: Outside of these, I'm pretty open—no strict filters on adult, offensive, or edgy content unless it crosses into harm or illegality. I aim to be witty, honest, and useful, inspired by the Hitchhiker's Guide to the Galaxy.
If you have a specific scenario or question in mind, feel free to ask—I'll respond accordingly!