CSV ⇄ JSON Without Headaches: Delimiters, Encoding, and Header Rules

Avoid broken rows and garbled characters. This guide covers delimiters (commas, semicolons, tabs, pipes), UTF-8 encoding, header rows, quoting, and a safe workflow using our CSV → JSON tool and JSON/XML/CSV Converter.

Quick start: clean conversions in 90 seconds

  1. Open your file in a plain-text editor and confirm it’s UTF-8 (no BOM).
  2. Check if the file uses comma, semicolon, tab, or pipe as a delimiter.
  3. Ensure there’s a single header row of unique names (no blanks, no duplicates).
  4. Wrap any field that contains a comma, newline, or double quote in double quotes, and escape each embedded quote by doubling it ("").
  5. Convert using CSV → JSON, preview, then export. For the reverse, use JSON/XML/CSV Converter.

Choose the right delimiter

CSV literally means “comma-separated values”, but regional settings or datasets sometimes prefer others:

  • Comma (,) — most common. Conflicts if descriptions contain commas.
  • Semicolon (;) — common in locales where comma is a decimal separator.
  • Tab (TSV) — great when fields include punctuation; still quote if fields contain newlines/tabs.
  • Pipe (|) — useful for logs or messy text data.
# Example (semicolon-delimited)
id;name;amount
1;Ava;12,50

Rule of thumb: pick a delimiter unlikely to appear in your data. If it might, quote the field.
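
To illustrate the rule of thumb above, here is a minimal Python sketch that guesses a file's delimiter by trying each candidate and keeping the one that gives every sampled row the same column count (more than one). The helper name `guess_delimiter` and the candidate list are assumptions made for this example; they are not how any particular converter detects delimiters.

```python
import csv
import io

# Candidate delimiters, in the order this article lists them.
CANDIDATES = [",", ";", "\t", "|"]


def guess_delimiter(sample: str) -> str:
    """Return the candidate that splits every sampled row into the same
    number of columns (more than one); fall back to a comma."""
    best, best_cols = ",", 1
    for delim in CANDIDATES:
        rows = [r for r in csv.reader(io.StringIO(sample), delimiter=delim) if r]
        widths = {len(r) for r in rows}
        if len(widths) == 1:
            cols = widths.pop()
            if cols > best_cols:
                best, best_cols = delim, cols
    return best


sample = "id;name;amount\n1;Ava;12,50\n"
delimiter = guess_delimiter(sample)
rows = list(csv.DictReader(io.StringIO(sample), delimiter=delimiter))

# Decimal commas arrive as text; convert them explicitly if you need numbers.
amount = float(rows[0]["amount"].replace(",", "."))
print(repr(delimiter), rows[0]["name"], amount)
```

A heuristic like this can be fooled by very short or irregular samples, so treat its answer as a suggestion and confirm it against a preview of the data.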

Quoting & escaping (RFC 4180 style)

  • Fields containing the delimiter, a double quote, or a newline must be wrapped in double quotes.
  • Inside quoted fields, escape quotes by doubling them: "He said ""hello""".
# ❌ Bad
id,name,notes
1,Ava,He said "hello", then left
# ✅ Good
id,name,notes
1,Ava,"He said ""hello"", then left"

Multiline text (addresses, bios) is valid when quoted—just make sure your tooling preserves line breaks.
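
If you generate CSV from code, a CSV library can apply these quoting rules for you. The sketch below uses Python's standard `csv` module with its default minimal quoting; the sample rows are made up for illustration.

```python
import csv
import io

row = [1, "Ava", 'He said "hello", then left']
multiline = [2, "Ben", "12 Main St\nSpringfield"]

buf = io.StringIO()
# The default quoting mode (QUOTE_MINIMAL) quotes only fields that need it
# and doubles any embedded double quotes.
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["id", "name", "notes"])
writer.writerow(row)
writer.writerow(multiline)
csv_text = buf.getvalue()
print(csv_text)

# Reading the text back restores the original values, line break included.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
print(parsed[2][2])
```

The second data row shows the multiline case: the line break sits inside a quoted field, so a parser treats both physical lines as a single record.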

Encoding: UTF-8, BOM, Windows-1252

  • UTF-8 (no BOM) is safest for web and APIs. Avoid adding a BOM; some parsers treat it as data.
  • Windows-1252/Latin-1 files that are read as UTF-8 show “smart quotes” and accented letters as garbled characters; resave them as UTF-8.
  • When in doubt, open the file in a hex viewer or a modern editor and convert it to UTF-8 before processing.
# Quick check (Linux/macOS):
file --mime-encoding data.csv
# Convert if needed:
iconv -f WINDOWS-1252 -t UTF-8 data.csv > data-utf8.csv
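
The same check can be done in code. This is a minimal sketch, assuming the only legacy encoding you expect is Windows-1252; the function name `decode_csv_bytes` is hypothetical. It tries UTF-8 first (dropping a BOM if one is present) and falls back only when the bytes are not valid UTF-8.

```python
def decode_csv_bytes(raw: bytes) -> str:
    """Decode CSV bytes as UTF-8 (dropping a leading BOM if present),
    falling back to Windows-1252 when the bytes are not valid UTF-8."""
    try:
        # "utf-8-sig" decodes UTF-8 and strips a BOM at the start, if any.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 input is Windows-1252. Files in other
        # legacy encodings would need a different fallback.
        return raw.decode("cp1252")


with_bom = b"\xef\xbb\xbfid,name\n1,Zo\xc3\xab\n"   # UTF-8 with a BOM
legacy = "id,name\n1,Zoë\n".encode("cp1252")        # Windows-1252 bytes

print(decode_csv_bytes(with_bom) == decode_csv_bytes(legacy))
```

Guessing an encoding is never fully reliable: some byte sequences are valid in more than one encoding, so spot-check accented characters after converting.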

Header rows & field names

  • Use exactly one header row with unique, descriptive names.
  • Avoid spaces/symbols; prefer snake_case or camelCase.
  • No duplicates, no blanks. If a column is empty, either fill or remove it.
# ✅ Good
id,full_name,email,signup_date,is_active
# ❌ Risky
ID,Name,Name,email,,Active?

When converting to JSON, each row becomes an object using header names as keys.
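
A header check is easy to automate before converting. The sketch below flags blank and duplicate column names, then shows how each row becomes an object keyed by header name; `header_problems` is a hypothetical helper and the sample data is invented.

```python
import csv
import io


def header_problems(header: list[str]) -> list[str]:
    """List blank and duplicate column names in a header row."""
    problems = []
    seen = set()
    for position, name in enumerate(header, start=1):
        cleaned = name.strip()
        if not cleaned:
            problems.append(f"column {position} is blank")
        elif cleaned in seen:
            problems.append(f"column {position} duplicates {cleaned!r}")
        seen.add(cleaned)
    return problems


# The "risky" header from the example above.
risky = next(csv.reader(io.StringIO("ID,Name,Name,email,,Active?\n")))
print(header_problems(risky))

# With a clean header, each row becomes a dict keyed by column name.
good_csv = "id,full_name,email\n1,Ava Lee,ava@example.com\n"
reader = csv.DictReader(io.StringIO(good_csv))
if not header_problems(list(reader.fieldnames)):
    records = list(reader)
    print(records)
```

This check compares names exactly; if your target system treats keys case-insensitively, you may also want to compare lowercased names.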

Types: numbers, dates, booleans, nulls

  • Numbers: avoid thousands separators; keep 1234.56. For IDs or credit-card-like values, store as strings to preserve leading zeros.
  • Dates: prefer ISO-8601 (2025-04-18 or 2025-04-18T10:30:00Z).
  • Booleans: normalize to true/false (or 1/0); avoid mixed “Yes/No/TRUE/False”.
  • Nulls: pick one representation and use it consistently: an empty field in CSV, null in JSON, or a sentinel such as NA.
# CSV → JSON example
id,name,zip,is_active,signup_date
1,Ava,01234,true,2025-04-18

[
  {"id":1,"name":"Ava","zip":"01234","is_active":true,"signup_date":"2025-04-18"}
]
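
CSV itself carries no type information, so the conversion above only works if something decides which columns are numbers, booleans, or strings. One way to do that, sketched here with a hypothetical per-column `CONVERTERS` table, is to convert explicitly and leave every unlisted column (such as `zip`) as a string.

```python
import csv
import io
import json


def to_bool(value: str) -> bool:
    """Accept only the normalized boolean spellings (true/false, 1/0)."""
    lowered = value.strip().lower()
    if lowered in ("true", "1"):
        return True
    if lowered in ("false", "0"):
        return False
    raise ValueError(f"not a normalized boolean: {value!r}")


# Hypothetical per-column converters; unlisted columns stay strings,
# which is what preserves the leading zero in "zip".
CONVERTERS = {"id": int, "is_active": to_bool}


def convert_row(row: dict) -> dict:
    out = {}
    for key, value in row.items():
        if value == "":
            out[key] = None  # an empty field becomes JSON null
        else:
            out[key] = CONVERTERS.get(key, str)(value)
    return out


csv_text = "id,name,zip,is_active,signup_date\n1,Ava,01234,true,2025-04-18\n"
records = [convert_row(r) for r in csv.DictReader(io.StringIO(csv_text))]
json_text = json.dumps(records, separators=(",", ":"))
print(json_text)
```

Raising on an unexpected boolean spelling is a deliberate choice in this sketch: a loud failure on "Yes" is easier to fix than a silently wrong value.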

Large files & streaming

  • Prefer streaming parsers or split files into chunks (e.g., 50–200MB) for reliability.
  • For logs, consider NDJSON (one JSON object per line) to append/stream efficiently.
  • Validate a small sample first; then run the full dataset.
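
As a sketch of the streaming approach, the function below reads one CSV row at a time and writes it straight out as an NDJSON line, so memory use does not grow with file size. The name `csv_to_ndjson` is hypothetical, and values are left as strings to keep the example short.

```python
import csv
import io
import json
from typing import TextIO


def csv_to_ndjson(src: TextIO, dst: TextIO, delimiter: str = ",") -> int:
    """Stream rows from src to dst as one JSON object per line.
    Only one row is held in memory at a time. Returns the row count."""
    count = 0
    for row in csv.DictReader(src, delimiter=delimiter):
        dst.write(json.dumps(row, ensure_ascii=False) + "\n")
        count += 1
    return count


# In-memory demo; with real files, pass handles opened with
# newline="" and encoding="utf-8" instead.
src = io.StringIO("id,name\n1,Ava\n2,Zoë\n")
dst = io.StringIO()
written = csv_to_ndjson(src, dst)
print(written)
print(dst.getvalue(), end="")
```

Running it on the first few hundred lines of a file is a cheap way to do the "validate a small sample first" step before committing to the full dataset.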

Safe CSV ⇄ JSON workflow (ToolsUseful)

  1. Open CSV → JSON and load your CSV.
  2. Pick the correct delimiter (auto-detect helps).
  3. Confirm header row and preview the first 50 rows.
  4. Export to JSON. For reverse conversions, use JSON/XML/CSV Converter.
  5. Validate in your app or with a schema before publishing.
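
For the reverse direction in step 4, the same rules apply in mirror image. Here is a minimal Python sketch of JSON → CSV, assuming the JSON is a flat array of objects (no nested objects or arrays); it illustrates the idea and does not describe how the converter itself works.

```python
import csv
import io
import json

json_text = """[
  {"id": 1, "name": "Ava", "zip": "01234"},
  {"id": 2, "name": "Ben", "zip": null}
]"""

records = json.loads(json_text)

# Collect field names in first-seen order so every key gets a column.
fieldnames = []
for record in records:
    for key in record:
        if key not in fieldnames:
            fieldnames.append(key)

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
for record in records:
    # None (JSON null) is written as an empty field; missing keys are too.
    writer.writerow(record)

csv_text = buf.getvalue()
print(csv_text, end="")
```

Note that the round trip is lossy for types: once written to CSV, the number 1 and the string "01234" are both plain text again, which is why the data dictionary in the tip below is worth keeping.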

Tip: keep a “data dictionary” (column name → description, type, examples) in your repo.

Troubleshooting

Rows shift / columns misaligned: a field contains a delimiter without quotes—wrap it and escape quotes as "".
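
A quick way to locate shifted rows is to compare each row's field count with the header's. The sketch below does that with Python's `csv` module; `misaligned_rows` is a hypothetical helper, and the two inputs are the bad and good examples from the quoting section.

```python
import csv
import io


def misaligned_rows(csv_text: str, delimiter: str = ",") -> list[tuple[int, int]]:
    """Return (row_number, field_count) for rows whose field count differs
    from the header's. Rows are numbered from 1, counting the header."""
    reader = csv.reader(io.StringIO(csv_text, newline=""), delimiter=delimiter)
    header = next(reader)
    return [
        (number, len(row))
        for number, row in enumerate(reader, start=2)
        if row and len(row) != len(header)
    ]


bad = 'id,name,notes\n1,Ava,He said "hello", then left\n'
good = 'id,name,notes\n1,Ava,"He said ""hello"", then left"\n'

print(misaligned_rows(bad))
print(misaligned_rows(good))
```

Row numbers here count parsed records, so in a file with quoted multiline fields they can differ from the line numbers your text editor shows.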

Garbled characters (�): file isn’t UTF-8—convert and retry.

Header becomes a row: the header row is missing or duplicated. Fix the file so it has a single, unique header row, then confirm the header row again in the converter preview.

Leading zeros disappear: treat as strings in JSON or set column type to text in your importer.
