JSON Data Validator

Abstract

A comprehensive JSON schema validation tool that validates JSON data against custom schemas, provides detailed error reporting, supports batch validation, and includes data generation capabilities. This project demonstrates advanced data validation, schema design, and file processing techniques.

🎯 Project Overview

This project creates a powerful JSON validation system with:

  • Custom JSON schema definition and validation
  • File and string validation capabilities
  • Batch validation for multiple files
  • Detailed error reporting with path information
  • Sample data generation from schemas
  • Validation statistics and reporting
  • Interactive command-line interface

✨ Features

Schema Management

  • Custom Schemas: Define and load custom JSON schemas
  • File-based Schemas: Load schemas from JSON files
  • Built-in Schemas: Pre-defined schemas for common data types
  • Schema Information: View detailed schema structure and requirements

Validation Capabilities

  • File Validation: Validate JSON files against schemas
  • String Validation: Validate JSON strings directly
  • Batch Validation: Validate multiple files with pattern matching
  • Real-time Validation: Immediate feedback with detailed error messages

Error Reporting

  • Detailed Errors: Path-based error reporting with context
  • Error Categories: Type mismatches, missing fields, constraint violations
  • Validation History: Track all validation attempts and results
  • Export Reports: Generate comprehensive validation reports

Data Generation

  • Sample Data: Generate valid sample data from schemas
  • Schema Analysis: Understand schema structure and requirements
  • Testing Support: Create test data for development and testing

🛠️ Technical Implementation

Class Structure

jsondatavalidator.py
class ValidationError:
    """A single validation failure with its location and context."""

    def __init__(self, path, message, expected=None, actual=None):
        self.path = path          # location of the failing value (nested paths are tracked)
        self.message = message    # description of the problem
        self.expected = expected  # what the schema required
        self.actual = actual      # what the data contained


class JSONSchema:
    """Holds a schema definition and the validation logic for it."""

    def __init__(self, schema):
        self.schema = schema

    def validate(self, data, path="root"):
        # Main validation entry point.
        # Validates nested structures recursively and returns
        # the validation status together with the error list.
        ...


class JSONDataValidator:
    """Manages validators and stored schemas; tracks history and statistics."""

    def __init__(self):
        ...

Key Components

Schema Validation

  • Type Checking: Validate data types (string, number, integer, boolean, array, object)
  • Constraint Validation: Check length, range, pattern, and enum constraints
  • Required Fields: Ensure all required fields are present
  • Nested Validation: Recursive validation for complex data structures
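
The type-checking step can be sketched as follows. This is a minimal illustration, not the project's code: `TYPE_MAP` and `check_type` are hypothetical names. One detail worth noting is that Python's `bool` is a subclass of `int`, so booleans have to be excluded from the numeric types explicitly.

```python
# Hypothetical mapping from schema type names to Python types.
TYPE_MAP = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_type(value, type_name):
    """Return True if value matches the given schema type name."""
    expected = TYPE_MAP.get(type_name)
    if expected is None:
        return False  # unknown type names are rejected in this sketch
    # bool is a subclass of int, so True/False must not count as numbers.
    if type_name in ("number", "integer") and isinstance(value, bool):
        return False
    return isinstance(value, expected)
```

With this helper, `check_type(3, "number")` passes while `check_type(True, "integer")` and `check_type(3.5, "integer")` do not.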

Error Handling

  • Path Tracking: Precise error location in dotted path notation (for example, root.users[0].email)
  • Context Information: Expected vs. actual value reporting
  • Error Aggregation: Collect all validation errors in single pass
  • User-friendly Messages: Clear, actionable error descriptions
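
Path tracking and error aggregation can be illustrated with a short recursive walk. The sketch below is an assumption about how such a pass could look, not the project's implementation: `collect_errors` is a hypothetical function that handles only objects, arrays, and strings, and it keeps going after the first problem so that every error is reported in a single pass.

```python
def collect_errors(data, schema, path="root"):
    """Return a list of (path, message) tuples for every violation found."""
    errors = []
    expected = schema.get("type")
    if expected == "object":
        if not isinstance(data, dict):
            return [(path, "expected object")]
        for name in schema.get("required", []):
            if name not in data:
                errors.append((f"{path}.{name}", "required field is missing"))
        for name, sub_schema in schema.get("properties", {}).items():
            if name in data:
                # Extend the path with the property name before recursing.
                errors.extend(collect_errors(data[name], sub_schema, f"{path}.{name}"))
    elif expected == "array":
        if not isinstance(data, list):
            return [(path, "expected array")]
        for index, item in enumerate(data):
            # Extend the path with the element index before recursing.
            errors.extend(collect_errors(item, schema.get("items", {}), f"{path}[{index}]"))
    elif expected == "string" and not isinstance(data, str):
        errors.append((path, "expected string"))
    return errors


schema = {
    "type": "object",
    "required": ["users"],
    "properties": {
        "users": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["email"],
                "properties": {"email": {"type": "string"}},
            },
        }
    },
}
errors = collect_errors({"users": [{"email": 5}, {}]}, schema)
# errors holds one entry for root.users[0].email and one for root.users[1].email
```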

Data Processing

  • JSON Parsing: Robust JSON file and string parsing
  • File Operations: Safe file handling with error recovery
  • Pattern Matching: Glob pattern support for batch operations
  • Data Serialization: Export validation results and reports
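
Safe file handling might look like the sketch below, which uses only the standard library. `load_json_file` is a hypothetical helper (the project's own function names are not shown in this section); it returns an error message instead of raising, so a batch run can continue after one bad file.

```python
import json


def load_json_file(file_path):
    """Return (data, None) on success or (None, error_message) on failure."""
    try:
        with open(file_path, "r", encoding="utf-8") as handle:
            return json.load(handle), None
    except FileNotFoundError:
        return None, f"File not found: {file_path}"
    except json.JSONDecodeError as exc:
        # JSONDecodeError carries the position of the syntax error.
        return None, f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    except (OSError, UnicodeDecodeError) as exc:
        return None, f"Could not read {file_path}: {exc}"
```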

🚀 How to Run

  1. Install Python: Ensure Python 3.7+ is installed. The validator uses only built-in modules, so no additional packages are required.

  2. Run the Validator:

    python jsondatavalidator.py
  3. Getting Started:

    • Explore pre-loaded sample schemas
    • Validate sample JSON files
    • Create custom schemas for your data
    • Generate sample data for testing

💡 Usage Examples

Basic Validation

jsondatavalidator.py
# Create validator instance
validator = JSONDataValidator()
 
# Add a simple schema
user_schema = {
    "type": "object",
    "required": ["name", "email"],
    "properties": {
        "name": {"type": "string", "minLength": 2},
        "email": {"type": "string", "pattern": r"^.+@.+\..+$"}
    }
}
validator.add_schema("user", user_schema)
 
# Validate data
user_data = {"name": "John", "email": "john@example.com"}
is_valid, errors = validator.validate_data(user_data, "user")
 
if is_valid:
    print("✅ Validation successful!")
else:
    for error in errors:
        print(f"❌ {error}")

File Validation

jsondatavalidator.py
# Validate JSON file
is_valid, errors = validator.validate_file("data.json", "user")
 
# Batch validate multiple files
results = validator.batch_validate("data/*.json", "user")
for file_path, (is_valid, errors) in results.items():
    print(f"{file_path}: {'✅' if is_valid else '❌'} ({len(errors)} errors)")

Schema Management

jsondatavalidator.py
import json

# Load schema from file
validator.load_schema_from_file("user_schema.json", "user")
 
# Get schema information
schema_info = validator.get_schema_info("user")
print(json.dumps(schema_info, indent=2))
 
# Generate sample data
sample_data = validator.create_sample_data("user")
print(json.dumps(sample_data, indent=2))

🎨 Interactive Features

  1. Validate JSON File - Validate single file against schema
  2. Validate JSON String - Validate JSON input directly
  3. Batch Validate Files - Validate multiple files with patterns
  4. Load Schema from File - Import schema from JSON file
  5. Add Schema Manually - Define schema interactively
  6. View Schema Info - Explore schema structure
  7. Generate Sample Data - Create valid sample data
  8. View Validation Statistics - Analytics dashboard
  9. Export Validation Report - Generate detailed reports
  10. List Available Schemas - View loaded schemas

Pre-loaded Sample Schemas

User Schema

{
  "type": "object",
  "required": ["name", "email", "age"],
  "properties": {
    "name": {"type": "string", "minLength": 2, "maxLength": 50},
    "email": {"type": "string", "pattern": "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"},
    "age": {"type": "integer", "minimum": 0, "maximum": 150},
    "status": {"type": "string", "enum": ["active", "inactive", "pending"]}
  }
}

Product Schema

{
  "type": "object",
  "required": ["name", "price", "category"],
  "properties": {
    "name": {"type": "string", "minLength": 1, "maxLength": 100},
    "price": {"type": "number", "minimum": 0},
    "category": {"type": "string", "enum": ["electronics", "clothing", "books"]},
    "tags": {"type": "array", "items": {"type": "string"}, "maxItems": 10}
  }
}

🔧 Advanced Features

Constraint Types

String Constraints

  • minLength/maxLength: String length validation
  • pattern: Regular expression pattern matching
  • enum: Allowed values from predefined list
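
As a rough sketch of how these string constraints could be checked, assuming a hypothetical `check_string` helper that receives a value already known to be a string: the pattern is tested with `re.search` because JSON Schema patterns are not implicitly anchored, so a schema that wants a full match has to include `^` and `$` itself.

```python
import re


def check_string(value, schema):
    """Return a list of messages, one per violated string constraint."""
    problems = []
    if "minLength" in schema and len(value) < schema["minLength"]:
        problems.append(f"shorter than minLength {schema['minLength']}")
    if "maxLength" in schema and len(value) > schema["maxLength"]:
        problems.append(f"longer than maxLength {schema['maxLength']}")
    if "pattern" in schema and re.search(schema["pattern"], value) is None:
        problems.append("does not match pattern")
    if "enum" in schema and value not in schema["enum"]:
        problems.append("not one of the allowed values")
    return problems
```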

Number Constraints

  • minimum/maximum: Numeric range validation
  • type: Distinguish between integer (whole numbers only) and number (any numeric value, including floats)

Array Constraints

  • minItems/maxItems: Array length validation
  • items: Schema for array elements
  • uniqueItems: Ensure array elements are unique
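
Checking uniqueItems takes a little care, because array elements may be objects or nested arrays, which cannot be placed in a Python `set` directly. One possible approach, shown here as a sketch with a hypothetical `has_unique_items` helper, is to compare a canonical JSON serialisation of each element. It treats `1` and `1.0` as different values, which a stricter implementation may want to handle.

```python
import json


def has_unique_items(items):
    """Return True if no two elements of the list serialise to the same JSON."""
    seen = set()
    for item in items:
        # sort_keys makes objects with the same content produce the same string.
        key = json.dumps(item, sort_keys=True)
        if key in seen:
            return False
        seen.add(key)
    return True
```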

Object Constraints

  • required: List of required properties
  • properties: Schema for object properties
  • additionalProperties: Control extra properties
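
A minimal sketch of the additionalProperties check, assuming a hypothetical `find_extra_properties` helper. In JSON Schema the keyword may also hold a schema that extra properties must satisfy; this sketch covers only the boolean case and treats a missing keyword as "extras allowed".

```python
def find_extra_properties(data, schema):
    """Return the sorted names of undeclared properties when extras are forbidden."""
    if schema.get("additionalProperties", True) is not False:
        return []  # extras are allowed (or governed by a schema this sketch ignores)
    declared = schema.get("properties", {})
    return sorted(name for name in data if name not in declared)
```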

Error Reporting

jsondatavalidator.py
# Example validation error
ValidationError(
    path="root.users[0].email",
    message="String does not match pattern",
    expected="email pattern",
    actual="invalid-email"
)

Batch Validation

jsondatavalidator.py
# Validate all JSON files in directory
results = validator.batch_validate("data/*.json", "user")
 
# Advanced pattern matching
results = validator.batch_validate("**/config.json", "config")
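
A note on the `**` pattern: with Python's standard `glob` module, `**` matches nested directories only when `recursive=True` is passed. Whether `batch_validate` does this internally is not shown here, so the following is a sketch of the file-discovery step under that assumption, using a hypothetical `find_files` helper.

```python
import glob


def find_files(pattern):
    """Return sorted paths matching a glob pattern; ** spans nested directories."""
    return sorted(glob.glob(pattern, recursive=True))
```

Called as `find_files("**/config.json")`, this returns every `config.json` in the current directory and below it.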

📊 Statistics and Reporting

Validation Statistics

jsondatavalidator.py
{
  'total_validations': 25,
  'successful_validations': 20,
  'failed_validations': 5,
  'success_rate': 80.0,
  'schema_usage': {
    'user': 15,
    'product': 10
  },
  'total_errors': 12,
  'loaded_schemas': ['user', 'product', 'config']
}
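
Figures like these can be derived from the validation history. The sketch below assumes each history entry is a dict with `schema_name`, `is_valid`, and `error_count` keys (the field names used in the validation report); `summarize` is a hypothetical name, and the list of loaded schemas is left out because it comes from the validator rather than the history.

```python
from collections import Counter


def summarize(history):
    """Compute summary statistics from a list of validation history entries."""
    total = len(history)
    successful = sum(1 for entry in history if entry["is_valid"])
    return {
        "total_validations": total,
        "successful_validations": successful,
        "failed_validations": total - successful,
        # Guard against division by zero when nothing has been validated yet.
        "success_rate": round(successful / total * 100, 1) if total else 0.0,
        "schema_usage": dict(Counter(entry["schema_name"] for entry in history)),
        "total_errors": sum(entry["error_count"] for entry in history),
    }
```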

Validation Report

{
  "generated_at": "2024-01-01T12:00:00.000000",
  "total_validations": 25,
  "successful_validations": 20,
  "failed_validations": 5,
  "results": [
    {
      "timestamp": "2024-01-01T12:00:00.000000",
      "schema_name": "user",
      "is_valid": true,
      "error_count": 0,
      "errors": []
    }
  ]
}
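
A report with this shape could be assembled and written out roughly as follows. `build_report` and `export_report` are hypothetical names, and the history entries are assumed to be JSON-serialisable dicts. `isoformat(timespec="microseconds")` is used because plain `isoformat()` omits the fractional part when the microsecond value is zero.

```python
import json
from datetime import datetime


def build_report(history):
    """Assemble a report dict from a list of validation history entries."""
    successful = sum(1 for entry in history if entry["is_valid"])
    return {
        "generated_at": datetime.now().isoformat(timespec="microseconds"),
        "total_validations": len(history),
        "successful_validations": successful,
        "failed_validations": len(history) - successful,
        "results": history,
    }


def export_report(history, file_path):
    """Write the report to file_path as indented JSON."""
    with open(file_path, "w", encoding="utf-8") as handle:
        json.dump(build_report(history), handle, indent=2)
```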

🛡️ Error Handling

  • JSON Parsing: Graceful handling of malformed JSON
  • File Operations: Safe file reading with error recovery
  • Schema Validation: Comprehensive error checking and reporting
  • Type Safety: Runtime type checking and validation

📚 Learning Objectives

  • JSON Schema: Understanding JSON schema specification
  • Data Validation: Implementing robust validation logic
  • Error Handling: Comprehensive error reporting systems
  • File Operations: Safe file handling and processing
  • Pattern Matching: Working with glob patterns and regular expressions

🎯 Real-world Applications

API Development

  • Validate API request/response data
  • Ensure data consistency across services
  • Generate API documentation from schemas
  • Create test data for API testing

Configuration Management

  • Validate application configuration files
  • Ensure required settings are present
  • Check configuration value constraints
  • Generate default configuration templates

Data Processing

  • Validate data imports and exports
  • Check data quality before processing
  • Ensure data format consistency
  • Generate sample data for testing

🚀 Potential Enhancements

  1. JSON Schema Draft Support: Implement the full JSON Schema specification (Draft 7 or 2019-09)
  2. Custom Validators: Allow custom validation functions
  3. Web Interface: Create Flask web application
  4. Database Integration: Store schemas and validation results in database
  5. CLI Tool: Scriptable, non-interactive command-line mode for automation
  6. Plugin System: Extensible validation framework
  7. Performance Optimization: Streaming validation for large files
  8. Schema Registry: Centralized schema management system

🔍 Schema Examples

Complex Nested Schema

jsondatavalidator.py
config_schema = {
    "type": "object",
    "required": ["app_name", "version", "database"],
    "properties": {
        "app_name": {"type": "string", "minLength": 1},
        "version": {"type": "string", "pattern": r"^\d+\.\d+\.\d+$"},
        "database": {
            "type": "object",
            "required": ["host", "port"],
            "properties": {
                "host": {"type": "string"},
                "port": {"type": "integer", "minimum": 1, "maximum": 65535},
                "ssl": {"type": "boolean"}
            }
        },
        "features": {
            "type": "array",
            "items": {"type": "string"},
            "uniqueItems": True
        }
    }
}

Sample Generated Data

{
  "app_name": "sample_string",
  "version": "1.0.0",
  "database": {
    "host": "sample_string",
    "port": 42,
    "ssl": true
  },
  "features": ["sample_string"]
}
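
Placeholder data of this kind can be produced by walking the schema recursively. The following is a simplified sketch, not the project's generator: `make_sample` is a hypothetical function, the placeholder values are borrowed from the sample output, and `pattern` constraints are ignored, so a field such as `version` would need separate handling.

```python
def _clamp(value, schema):
    """Move value into the schema's minimum/maximum range, if one is given."""
    value = max(value, schema.get("minimum", value))
    return min(value, schema.get("maximum", value))


def make_sample(schema):
    """Build a placeholder value for the schema (pattern constraints are ignored)."""
    if "enum" in schema:
        return schema["enum"][0]  # any allowed value is valid; take the first
    kind = schema.get("type")
    if kind == "object":
        properties = schema.get("properties", {})
        return {name: make_sample(sub) for name, sub in properties.items()}
    if kind == "array":
        return [make_sample(schema.get("items", {}))]
    if kind == "string":
        return "sample_string"
    if kind == "integer":
        return int(_clamp(42, schema))
    if kind == "number":
        return float(_clamp(42, schema))
    if kind == "boolean":
        return True
    return None  # no type given or type not covered by this sketch
```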

🏆 Project Completion

This JSON Data Validator demonstrates:

  • ✅ JSON schema validation covering the constraint types described above
  • ✅ Comprehensive error reporting and tracking
  • ✅ Batch processing and file operations
  • ✅ Data generation and schema analysis
  • ✅ Interactive user interface
  • ✅ Statistics and reporting capabilities

Perfect for beginners learning data validation concepts and intermediate developers working with JSON data processing and API development!
