7 things an SME should never upload to ChatGPT
ChatGPT can help an SME, but not everything can be pasted into a chat. Here are 7 types of data not to upload without rules, anonymization and controls.
Gaetano Castaldo
TL;DR
ChatGPT can be very useful for an SME: writing emails, summarizing documents, preparing drafts, analyzing text, speeding up operational tasks.
The problem is not using ChatGPT.
The problem is using it as if it were a private notepad, pasting company data into it without knowing which settings are active, which plan is in use and which information is leaving the company perimeter.
Without a prior assessment, an SME should avoid uploading to ChatGPT:
- personal data of customers, employees and vendors;
- CRM exports and customer lists;
- real contracts, NDAs and price quotes;
- passwords, tokens, API keys and credentials;
- HR documents and employee information;
- legal information, litigation and internal reports;
- source code, logs and production configurations.
You do not need to ban AI. You need a simple policy: which tools are authorized, which data can be used, what must be anonymized, when a Business or Enterprise environment is needed and who decides in doubtful cases.
The problem I am seeing in companies right now
Over the past few months, talking with companies and small teams, I have seen a bit of everything.
People paste CRM exports, contracts, customer emails, internal documents, employee data, sales offers and confidential information into ChatGPT. They often do it from personal accounts, without knowing whether data sharing for model improvement is active, without a company policy and without asking themselves whether that data can really leave the company perimeter.
The point is not to demonize ChatGPT. I use it too, and when used well it can be a huge tool for an SME.
The problem is that many companies are treating it as a smart notepad, not as an external service that company data is sent to.
And here comes a term that still feels too strong in many contexts: data breach.
Not every wrong prompt is automatically a notifiable violation. It depends on the data, the tool, the plan used, the settings, the contractual agreements and the risk to the people involved.
But if an employee uploads personal data, HR information, health data, credentials, contracts or confidential documents into an unauthorized tool, the matter cannot be dismissed with: "well, it was just ChatGPT".
It must at least be assessed.
Today, this mindset is still missing in many SMEs. There is a lot of talk about AI, productivity and automation, but much less about a simpler question:
which data can I really put into an AI tool and which not?
This is the question that should come before the prompt.
ChatGPT is not always the same ChatGPT
Before talking about what not to upload, one point must be clear: ChatGPT is not always "the same ChatGPT".
A free personal account does not offer the same level of control as a Business, Team or Enterprise environment configured by the company.
OpenAI states that in ChatGPT Business, Enterprise and API plans, company data is not used to train the models by default. This is an important point, because it reduces one of the most discussed risks when talking about business use of AI.
But it is not enough.
Having a better plan does not mean you can upload anything without judgment. You still need to know:
- which data is processed;
- who can access the tool;
- which settings are active;
- which logs and controls are available;
- whether a DPA or data processing agreement exists;
- which business processes authorize that use.
Even the provider's certifications do not replace this assessment. They can say a lot about the service's security posture, but they do not automatically certify that your way of using that tool is correct, proportionate and compliant.
The point is simple: the security declared by the vendor does not replace the company's internal governance.
Why "data breach" is not an exaggerated phrase
In everyday language, "data breach" sounds enormous. It sounds like something that only concerns multinationals, hacker attacks, stolen databases and press releases.
In reality, for both the GDPR and the Italian Data Protection Authority, a personal data breach also covers the unauthorized disclosure of, or access to, personal data.
So you do not need to imagine a spectacular cyberattack. Sometimes the problem starts from a much more trivial action: pasting personal data into an unapproved tool, with no organizational basis, no assessment and no control.
I am not saying every improper use of ChatGPT is automatically a notifiable data breach.
I am saying that if personal or special-category data ends up in the prompt, the company must have the maturity to treat it as an event to be assessed. Not as a harmless slip.
1. Personal data of customers, employees and vendors
Names, emails, phone numbers, addresses, tax codes, bank details, identity documents, billing data.
These are details an SME handles every day. That is exactly why they easily end up in prompts.
A typical example:
"Analyze this customer list and tell me who I should contact first."
If it contains names, emails, numbers, sales notes and interaction history, you are not just asking the AI for advice. You are transferring personal data and company information to an external service.
Better to do this:
- remove names and contacts;
- use fake IDs;
- work on aggregated data;
- delete unnecessary columns;
- use synthetic examples when possible.
The practical rule is this: if ChatGPT can help you even without knowing who the person is, do not give it the person's identity.
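The steps above can be sketched in a few lines of Python. This is a minimal sketch, not a complete anonymization tool: the column names (`name`, `email`, `phone`) and the `CUST-0001` ID format are assumptions made for illustration, and the sample rows are invented.

```python
import csv
import io

# Hypothetical column names: adapt them to your own export.
IDENTIFYING_COLUMNS = {"name", "email", "phone"}


def pseudonymize(rows, key_column="email"):
    """Swap identifying fields for a fake ID and drop them from each row.

    Returns the cleaned rows plus a mapping fake ID -> original key.
    The mapping stays inside the company and never goes into a prompt.
    """
    id_by_key = {}
    cleaned = []
    for row in rows:
        key = row[key_column]
        # Reuse the same fake ID when the same customer appears again.
        fake_id = id_by_key.setdefault(key, f"CUST-{len(id_by_key) + 1:04d}")
        safe = {k: v for k, v in row.items() if k not in IDENTIFYING_COLUMNS}
        cleaned.append({"customer_id": fake_id, **safe})
    mapping = {fake_id: key for key, fake_id in id_by_key.items()}
    return cleaned, mapping


# Invented sample data standing in for a real customer list.
sample = io.StringIO(
    "name,email,phone,industry,last_contact_days\n"
    "Mario Rossi,mario@example.com,555-0100,retail,12\n"
    "Anna Bianchi,anna@example.com,555-0101,logistics,45\n"
)
cleaned, mapping = pseudonymize(list(csv.DictReader(sample)))
```

Only `cleaned` is a candidate for the prompt. Because the mapping still exists, this is pseudonymization rather than full anonymization, and free-text columns such as sales notes can still contain names and need a separate check.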
2. CRM exports and customer lists
A CRM export is far more sensitive than it seems.
It does not only contain contact records. It often contains open deals, opportunity value, closing probability, contact history, customer problems, sales notes, objections, discounts, mentioned competitors and strategic information.
For an SME, the CRM is part of its commercial assets.
Uploading it into ChatGPT without rules is risky for three reasons:
- it contains personal data;
- it contains confidential information;
- it reveals how the company sells, evaluates and manages customers.
This holds even when the file looks like "just an Excel file".
Better to do this: prepare an anonymized version, remove identifying fields, work on aggregated segments and ask for analysis on patterns, not on individual recognizable customers.
A correct example:
"Analyze this anonymized dataset with industry, company size, pipeline stage and days since last contact. Tell me which patterns indicate the risk of a stalled opportunity."
Here the AI works on the process, not on the names.
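Working on aggregated segments can look like the sketch below. The field names (`industry`, `stage`, `days_since_contact`) are assumptions that mirror the example prompt, and the records are invented.

```python
from collections import defaultdict


def summarize_pipeline(records):
    """Group deals by (industry, stage) and report only counts and averages."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["industry"], record["stage"])].append(
            record["days_since_contact"]
        )
    return [
        {
            "industry": industry,
            "stage": stage,
            "deals": len(days),
            "avg_days_since_contact": round(sum(days) / len(days), 1),
        }
        for (industry, stage), days in sorted(groups.items())
    ]


# Invented records: no customer names, only process attributes.
records = [
    {"industry": "retail", "stage": "proposal", "days_since_contact": 10},
    {"industry": "retail", "stage": "proposal", "days_since_contact": 40},
    {"industry": "logistics", "stage": "negotiation", "days_since_contact": 7},
]
summary = summarize_pipeline(records)
```

The summary rows are what goes into the prompt, not the export. A group with a single deal can still point to one recognizable customer, so very small groups are worth merging or dropping before sharing.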
3. Real contracts, NDAs and price quotes
Many use ChatGPT to simplify a contract, explain a clause, improve a sales proposal or rewrite an offer.
The use in itself can make sense.
The problem is uploading the full document with names, amounts, confidential terms, penalties, discounts, margins, negotiated clauses and customer references.
A contract is not just a text. It is a snapshot of a commercial relationship.
A price quote is not just a PDF. It often contains strategy, pricing, margins, positioning and terms that should not leave company control.
Better to do this:
- remove customer and vendor names;
- replace real amounts with fake values;
- delete unnecessary sensitive clauses;
- ask for support on a structure or a generic example;
- have the legal topics validated by a professional.
ChatGPT can help make a text clearer. It should not become the place where you upload real contracts without control.
4. Passwords, API keys, tokens and credentials
Here the rule is simple: never.
Passwords, access tokens, API keys, FTP credentials, connection strings, production secrets, .env files, certificates and private keys must not be pasted into ChatGPT.
This is not a "privacy" topic. It is operational security.
If a key is exposed, it should be considered compromised.
Better to do this:
- replace real values with placeholders;
- use examples like API_KEY_EXAMPLE or DB_PASSWORD_HERE;
- remove internal endpoints if they are not needed;
- do not paste full configuration files;
- if a real key has already been shared, rotate it.
A correct example:
I am getting this error while using an API key.
I replaced the real values with placeholders:
API_KEY=API_KEY_EXAMPLE
DB_HOST=example.internal
The model does not need your real key to help you understand an error.
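Replacing real values with placeholders can be partly automated before anything is pasted. The sketch below is a minimal example that assumes a simple `NAME=value` format and a hypothetical list of name hints; it is not a complete secret scanner, and the sample values are invented.

```python
import re

# Assumption: variables whose name contains one of these words hold a secret.
SECRET_HINTS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "PWD")

# Matches lines such as `API_KEY=abc` or `export DB_PASSWORD = abc`.
ASSIGNMENT = re.compile(
    r"^(\s*(?:export\s+)?)([A-Za-z_][A-Za-z0-9_]*)(\s*=\s*)(.*)$"
)


def redact_env(text):
    """Replace the value of secret-looking variables with a placeholder."""
    redacted_lines = []
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if match and any(hint in match.group(2).upper() for hint in SECRET_HINTS):
            prefix, name, equals, _value = match.groups()
            line = f"{prefix}{name}{equals}{name}_EXAMPLE"
        redacted_lines.append(line)
    return "\n".join(redacted_lines)


# Invented configuration, not a real key.
config = "API_KEY=sk-live-123456\nDB_HOST=db.internal\nDB_PASSWORD = hunter2\n# comment"
redacted = redact_env(config)
```

A name-based filter misses secrets stored under neutral names, for example a connection string with an embedded password, and it leaves hosts such as `DB_HOST` untouched. It reduces mistakes but does not replace reading what you paste.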
5. HR documents and employee information
Resumes, payslips, performance reviews, disciplinary actions, certificates, health information, internal notes, salaries, holidays, absences, individual goals.
This data is sensitive by definition, even when the company is small and "we all know each other".
Asking ChatGPT to "rewrite this disciplinary notice" or "evaluate this employee" using real data can create serious risks.
The problem is not only confidentiality. It is also the context: HR, employment and people are areas where a superficial use of AI can produce wrong decisions, bias or documents built on mishandled information.
Better to do this:
- use generic templates;
- describe the case in abstract terms;
- remove names, dates, overly identifying roles and health details;
- do not ask the AI to make decisions about people;
- always have sensitive texts reviewed by whoever holds HR or legal responsibility.
It is one thing to ask: "draft me a structure for a job description".
It is another to upload a real employee evaluation and ask what to do with it.
6. Legal information, litigation and internal reports
Lawsuits, complaints, lawyers' letters, whistleblowing reports, disputes with customers or vendors, objections, pre-litigation documents.
These contents may contain personal data, confidential information, legal strategy and material that must be handled with extreme care.
ChatGPT can help clarify a text, prepare an outline or make a draft more readable.
But it should not become the place where you upload real case files without control.
Better to do this:
- work on depersonalized summaries;
- remove names, dates, amounts and identifying references;
- do not upload full attachments;
- involve a lawyer or a compliance contact in sensitive cases;
- use company-approved tools.
Here the risk is not only "what ChatGPT does with that data". The risk is that the company loses control over information that was meant to stay within a restricted perimeter.
7. Source code, logs and production configurations
Code can contain proprietary logic. Logs can contain personal data, emails, tokens, errors, IP addresses, user IDs and infrastructure details.
Even a seemingly harmless configuration can reveal architecture, services used, internal endpoints, dependencies, permissions and vulnerabilities.
This is especially true when AI is used by developers, external consultants or vendors looking for quick support.
Better to do this:
- remove secrets and credentials;
- strip user data from logs;
- replace real endpoints with examples;
- share only the necessary snippet;
- use approved company environments for recurring technical work.
It is one thing to ask for help on a generic error.
It is another to paste production logs with real data.
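Stripping user data from a log line before sharing it can be sketched as follows. The three patterns (email addresses, IPv4 addresses, bearer tokens) are assumptions about what a typical log contains, not an exhaustive list, and the log line is invented.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
BEARER = re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._~+/=-]+")


def scrub_log_line(line):
    """Mask tokens, emails and IPv4 addresses in a single log line."""
    line = BEARER.sub("Bearer <TOKEN>", line)
    line = EMAIL.sub("<EMAIL>", line)
    line = IPV4.sub("<IP>", line)
    return line


# Invented log line using documentation-only values.
raw = (
    "2024-01-01 12:00:00 ERROR login failed for mario@example.com "
    "from 203.0.113.7 Authorization: Bearer abc.def-123"
)
clean = scrub_log_line(raw)
```

Patterns like these catch the obvious cases only. User IDs, customer names in free text and internal hostnames need rules specific to your own logs, which is one more reason to share only the necessary snippet.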
The right question is not "can I use ChatGPT?"
The correct question is:
which data can I use, with which tool, for which purpose and with which controls?
For an SME, this distinction is fundamental.
Banning ChatGPT often does not work. People will use it anyway, perhaps with personal accounts and without telling anyone.
The better solution is to create simple rules:
- what can be uploaded;
- what must be anonymized;
- which data is forbidden;
- which accounts or plans are approved;
- who is responsible in case of doubt;
- how to handle an error or a possible exposure.
This is AI governance applied to real life, not bureaucracy.
Quick checklist for an SME
Before pasting anything into ChatGPT, ask these questions:
- does it contain personal data?
- does it contain names of customers, vendors or employees?
- does it contain amounts, margins or confidential terms?
- does it contain passwords, tokens or API keys?
- does it contain HR, health or disciplinary information?
- does it contain legal information or litigation?
- does it contain real code, logs or configurations?
- can I get the same result using fake or anonymized data?
- am I using a personal account or an approved company environment?
- is there an internal policy that authorizes this use?
If the answer is "yes" to any of the first seven questions, stop.
It does not mean AI cannot help you. It means you have to change how you prepare the prompt, use anonymized data or choose a more controlled environment.
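For teams that want the checklist inside an internal tool, the stop rule can be written as a small function. This is a sketch under assumptions: the flag names are hypothetical labels for the first seven questions, and treating an unanswered question as "yes" is my own cautious default, not a rule from any standard.

```python
# Hypothetical labels for the first seven checklist questions.
CONTENT_FLAGS = (
    "personal_data",
    "names",
    "amounts_or_terms",
    "credentials",
    "hr_or_health",
    "legal",
    "code_logs_config",
)


def preflight(answers):
    """Return ("stop", reasons) if any content question is answered yes.

    Assumption: a question with no answer counts as "yes".
    """
    reasons = [flag for flag in CONTENT_FLAGS if answers.get(flag, True)]
    return ("stop", reasons) if reasons else ("proceed", [])


all_clear = {flag: False for flag in CONTENT_FLAGS}
decision_ok = preflight(all_clear)
decision_stop = preflight({**all_clear, "credentials": True})
```

A "stop" result is not a ban. It is the signal to rework the prompt, anonymize the data or move to an approved environment.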
Frequently asked questions
Does ChatGPT Business or Enterprise solve the problem?
It reduces some risks, but it does not solve everything by itself. OpenAI states that Business, Enterprise and API data is not used to train the models by default. Still, the company must define policies, roles, access, allowed data, retention and controls.
If I turn off data sharing, can I upload everything?
No. Turning off sharing or using a business plan is an important control, but it does not automatically turn any data into uploadable data. Privacy, confidentiality, minimization and necessity remain criteria to assess.
Can I use anonymized data?
Yes, it is often the best path. But anonymizing does not just mean removing the name. You also have to remove combinations of information that can make a person or a customer recognizable.
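One way to check for recognizable combinations is to count how many rows share the same values. The sketch below follows the idea behind k-anonymity; the threshold `k=3`, the column names and the rows are assumptions chosen for illustration.

```python
from collections import Counter


def risky_combinations(rows, quasi_identifiers, k=3):
    """Return the value combinations shared by fewer than k rows."""
    counts = Counter(
        tuple(row[column] for column in quasi_identifiers) for row in rows
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Invented rows with no names at all.
rows = [
    {"industry": "retail", "city": "Milan", "size": "10-50"},
    {"industry": "retail", "city": "Milan", "size": "10-50"},
    {"industry": "retail", "city": "Milan", "size": "10-50"},
    {"industry": "shipbuilding", "city": "Trieste", "size": "500+"},
]
risky = risky_combinations(rows, ["industry", "city", "size"])
```

Here the single shipbuilding company is flagged even though its name was removed: the combination of industry, city and size is enough to recognize it. Rows like this should be generalized or dropped before the data is shared.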
Is Temporary Chat enough?
It can help for some uses, but it does not replace a company policy. The point is not only the chat history: it is understanding which data you are sending, on what basis, to which service and with which controls.
What should I do if someone has already uploaded sensitive data?
Do not minimize. Reconstruct what was uploaded, from which account, with which settings, what type of data was present and who might be involved. Then assess with the privacy, legal or security contact whether the event requires further action.
Conclusion
ChatGPT is not the problem.
The problem is using it without distinguishing between a harmless prompt and sensitive company content.
For an SME, the goal should not be to block AI. It should be to make it usable safely: with clear rules, minimal training, approved tools and control processes.
AI can save time. But if it enters business processes without governance, it can create risks that only become visible when it is too late.
If you want to understand which AI tools you can really use in your company, which data you can process and where you risk exposure, start from an AI assessment: processes, data, risks and concrete use cases.
Founder & CEO · Castaldo Solutions
I am a digital transformation consultant with enterprise experience. I help Italian SMEs adopt AI, CRM and IT architectures with measurable results in 90 days.