# Bongo Import

Laravel package for importing legacy Bongo CMS content (pages, posts, reviews) from a remote MySQL database into Bongo v3.0. Handles nested content structures, HTML transformation, URL rewriting, and image migration.

[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

## Features

- **Interactive Import Wizard** - CLI prompts for database credentials and content selection
- **Page Import** - Migrates pages with nested chunks/texts structure
- **Post Import** - Migrates blog posts with HTML content
- **Review Import** - Migrates customer reviews with rating conversion
- **HTML Cleaning** - Strips styles, classes, empty tags; preserves semantic content
- **URL Transformation** - Rewrites legacy URLs to new route patterns
- **Image Migration** - Downloads remote images and stores them locally with polymorphic relationships
- **Reset Utility** - Truncates destination tables and deletes downloaded images

## Requirements

- PHP 8.2+
- Laravel 10+
- Composer
- Access to legacy Bongo database (MySQL)

## Installation

### Composer

```bash
composer require bongo/import
```

### Laravel Integration

The service provider is registered automatically via Laravel's package discovery. If package discovery is disabled, register it manually in `config/app.php`:

```php
'providers' => [
    Bongo\Import\ImportServiceProvider::class,
],
```

## Configuration

Publish and edit the configuration file to set default database credentials:

```bash
php artisan vendor:publish --provider="Bongo\Import\ImportServiceProvider"
```

Edit `config/import.php`:

```php
return [
    'site_url' => 'https://example.com',    // Remote site URL for image downloads
    'db_host' => '127.0.0.1',               // Legacy database host
    'db_name' => 'legacy_db',               // Legacy database name
    'db_username' => 'username',            // Legacy database username
    'db_password' => 'password',            // Legacy database password
];
```

**Note**: These are used as defaults in the interactive wizard. You'll be prompted to confirm or override them at runtime.
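
To keep real credentials out of the published config file, the defaults could be read from the environment instead. The fragment below is a sketch only: the `IMPORT_*` variable names are hypothetical and are not defined by the package.

```php
// config/import.php (sketch; the IMPORT_* names are assumptions, not package settings)
return [
    'site_url'    => env('IMPORT_SITE_URL', 'https://example.com'),
    'db_host'     => env('IMPORT_DB_HOST', '127.0.0.1'),
    'db_name'     => env('IMPORT_DB_NAME', 'legacy_db'),
    'db_username' => env('IMPORT_DB_USERNAME', 'username'),
    'db_password' => env('IMPORT_DB_PASSWORD', ''),
];
```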

## Usage

### Running an Import

```bash
php artisan import:run
```

The wizard will prompt you for:
1. **Site URL** - The remote Bongo site URL (for downloading images)
2. **Database credentials** - Host, database name, username, password
3. **Content selection** - Which content types to import (pages, posts, reviews)

The command will:
- Create a temporary database connection to the legacy database
- Test the connection
- Import selected content types
- Download and store images
- Display progress and any errors
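
The temporary connection step can be pictured roughly as follows. This is a sketch under assumptions, not the package's actual code: the `buildImportConnection()` helper and the shape of its input array are hypothetical, and the charset and collation values are guesses that may need adjusting for an older database.

```php
<?php

/**
 * Build a MySQL connection definition from the wizard's answers.
 * The array keys follow Laravel's standard connection format;
 * the function itself is a hypothetical helper for illustration.
 */
function buildImportConnection(array $answers): array
{
    return [
        'driver'    => 'mysql',
        'host'      => $answers['db_host'],
        'port'      => $answers['db_port'] ?? 3306,
        'database'  => $answers['db_name'],
        'username'  => $answers['db_username'],
        'password'  => $answers['db_password'],
        'charset'   => 'utf8mb4',            // assumption: adjust for the legacy data
        'collation' => 'utf8mb4_unicode_ci', // assumption
    ];
}

$connection = buildImportConnection([
    'db_host'     => '127.0.0.1',
    'db_name'     => 'legacy_bongo',
    'db_username' => 'root',
    'db_password' => 'secret',
]);

// Inside a Laravel command, the definition could then be registered and tested:
//   config(['database.connections.import' => $connection]);
//   DB::purge('import');
//   DB::connection('import')->getPdo(); // throws if the connection fails
```

Registering the connection at runtime under the name `import` matches the `$connection = 'import'` property used by the source models.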

### Resetting Before Re-import

If you need to re-run an import, first reset the destination data:

```bash
php artisan import:reset
```

This command will:
- Truncate tables: `pages`, `posts`, `reviews`, `images`, `imageables`
- Delete storage directories: `/public/pages`, `/public/posts`, `/public/cache`

### Example Workflow

```bash
# Reset existing data
php artisan import:reset

# Run fresh import
php artisan import:run

# Answer prompts:
# Site URL: https://old-site.com
# Database Host: 127.0.0.1
# Database Name: legacy_bongo
# Database Username: root
# Database Password: ********
# Import pages? yes
# Import posts? yes
# Import reviews? yes
```

## How It Works

### Data Flow

```
Legacy Database (MySQL)
    ↓
Import Command (prompts for credentials)
    ↓
Service Classes (transform data)
    ├─ ImportPageFromBongo
    ├─ ImportPostFromBongo
    └─ ImportReviewFromBongo
    ↓
HTML Processing (clean, transform URLs)
    ↓
Image Download (store locally)
    ↓
Destination Models (Bongo v3.0)
    ├─ Bongo\Page\Models\Page
    ├─ Bongo\Post\Models\Post
    └─ Bongo\Review\Models\Review
```

### HTML Transformation

The import process cleans legacy HTML by:
- Removing `<br>` tags
- Stripping inline styles and classes
- Removing empty tags
- Wrapping icons and shortcodes
- Keeping only semantic tags: `h1` to `h6`, `p`, `b`, `i`, `ul`, `li`, `a`, `img`
- Fixing malformed markup
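
A simplified version of these cleaning rules might look like the sketch below. It is not the package's `cleanHtml()` implementation: the `cleanLegacyHtml()` function is hypothetical, it uses regular expressions rather than a DOM parser, and it leaves out icon/shortcode wrapping and malformed-markup repair.

```php
<?php

/**
 * Hypothetical sketch of the cleaning rules listed above.
 */
function cleanLegacyHtml(string $html): string
{
    // Drop <br>, <br/> and <br /> line breaks.
    $html = preg_replace('/<br\s*\/?>/i', '', $html);

    // Keep only the allowed tags; other tags are removed but their text stays.
    $allowed = '<h1><h2><h3><h4><h5><h6><p><b><i><ul><li><a><img>';
    $html = strip_tags($html, $allowed);

    // Strip inline style and class attributes (double- or single-quoted values).
    $html = preg_replace('/\s+(?:style|class)\s*=\s*(?:"[^"]*"|\'[^\']*\')/i', '', $html);

    // Remove empty paired tags such as <p></p>, repeating so nested empties collapse too.
    do {
        $html = preg_replace('/<(h[1-6]|p|b|i|ul|li|a)\b[^>]*>\s*<\/\1>/i', '', $html, -1, $count);
    } while ($count > 0);

    return trim($html);
}

echo cleanLegacyHtml('<div class="x"><p style="color:red">Hello<br/> <span>world</span></p><p> </p></div>');
// prints: <p>Hello world</p>
```

The attribute pattern is deliberately naive and would also match text such as ` class="x"` appearing outside a tag, so a DOM-based approach is safer for untrusted or unusual markup.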

### URL Rewriting

Legacy URLs are transformed to new patterns:
- `?p=page.name` → `/page-name`
- `?review=reviews/slug` → `/reviews/slug`
- `?blog=blogs/archive/YYYY/MM/DD/name.aspx` → `/posts/name`
- Domain names are removed from internal links
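
The patterns above can be illustrated with a small standalone function. This is a sketch, assuming the rules apply in the order shown; `rewriteLegacyUrl()` and its `$siteHost` parameter are hypothetical names, and the post name is passed through unchanged rather than slugified.

```php
<?php

/**
 * Hypothetical sketch: rewrite one legacy URL to the new route pattern.
 */
function rewriteLegacyUrl(string $url, string $siteHost): string
{
    // Remove scheme and host from internal links, e.g. https://old-site.com/?p=a.b becomes /?p=a.b
    $url = preg_replace('#^https?://(?:www\.)?' . preg_quote($siteHost, '#') . '#i', '', $url);

    // ?blog=blogs/archive/YYYY/MM/DD/name.aspx becomes /posts/name
    if (preg_match('#\?blog=blogs/archive/\d{4}/\d{2}/\d{2}/([^/&]+)\.aspx#i', $url, $m)) {
        return '/posts/' . $m[1];
    }

    // ?review=reviews/slug becomes /reviews/slug
    if (preg_match('#\?review=(reviews/[^&]+)#i', $url, $m)) {
        return '/' . $m[1];
    }

    // ?p=page.name becomes /page-name
    if (preg_match('#\?p=([^&]+)#i', $url, $m)) {
        return '/' . str_replace('.', '-', $m[1]);
    }

    // Anything else (external links, already-clean paths) is left alone.
    return $url;
}

echo rewriteLegacyUrl('https://old-site.com/?p=about.us', 'old-site.com');
// prints: /about-us
```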

### Image Migration

Images are:
1. Extracted from `<img>` tags in content
2. Downloaded from the remote site via `file_get_contents()`
3. Stored in `storage/app/public/{type}/{model_id}/{filename}`
4. Registered in the `images` table with dimensions and metadata
5. Attached to models via polymorphic relationships
6. Referenced in content as `/photos/{filename}` once the original `src` values are rewritten
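
The extraction and path-building steps above can be sketched in plain PHP. This is an illustration only, assuming the `dom` extension is available; `extractImageSources()`, `storagePathFor()` and `publicReference()` are hypothetical helpers, not methods of `ImageHelper`.

```php
<?php

/** Return the src attribute of every <img> tag in an HTML fragment. */
function extractImageSources(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Legacy markup is often invalid, so collect libxml warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $sources[] = $src;
        }
    }

    return $sources;
}

/** Build the local storage path described in step 3. */
function storagePathFor(string $type, int $modelId, string $src): string
{
    $filename = basename((string) parse_url($src, PHP_URL_PATH));

    return "storage/app/public/{$type}/{$modelId}/{$filename}";
}

/** Build the reference written back into the content in step 6. */
function publicReference(string $src): string
{
    return '/photos/' . basename((string) parse_url($src, PHP_URL_PATH));
}

echo storagePathFor('pages', 12, 'https://old-site.com/uploads/logo.png');
// prints: storage/app/public/pages/12/logo.png
```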

### Data Transformations

- **Review ratings**: 10-point scale → 5-star (`round($rank / 2)`)
- **Post slugs**: Dots replaced with dashes, then slugified
- **Post meta descriptions**: Generated from content if missing (150 character limit)
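
As a rough illustration of these three transformations, the sketch below uses plain PHP functions. The function names are hypothetical, the slug logic is a simplified stand-in for whatever slugifier the package actually uses, and `mb_substr()` assumes the `mbstring` extension.

```php
<?php

/** Convert a 0-10 rank to a 0-5 star rating. */
function convertRating(int $rank): int
{
    // PHP rounds halves away from zero, so 7 / 2 = 3.5 becomes 4.
    return (int) round($rank / 2);
}

/** Replace dots with dashes, then reduce to a lowercase dash-separated slug. */
function slugifyPagename(string $pagename): string
{
    $slug = strtolower(str_replace('.', '-', $pagename));
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);

    return trim($slug, '-');
}

/** Use the stored meta description, or derive one from the content. */
function metaDescription(?string $pagemeta, string $html, int $limit = 150): string
{
    if ($pagemeta !== null && trim($pagemeta) !== '') {
        return trim($pagemeta);
    }

    // Strip tags and collapse whitespace before truncating.
    $text = trim(preg_replace('/\s+/', ' ', strip_tags($html)));

    return mb_substr($text, 0, $limit);
}

echo convertRating(7), ' ', slugifyPagename('Spring.News.2015');
// prints: 4 spring-news-2015
```

Note that a legacy rank of 0 maps to 0 stars under this formula, which may need special handling if the destination expects ratings from 1 to 5.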

## Architecture

### Service Provider

Extends `Bongo\Framework\Providers\AbstractServiceProvider` for automatic bootstrapping:

```php
class ImportServiceProvider extends AbstractServiceProvider
{
    protected string $module = 'import';

    protected array $commands = [
        ImportCommand::class,
        ResetCommand::class,
    ];
}
```

### Commands

- **ImportCommand** (`import:run`) - Interactive import wizard
- **ResetCommand** (`import:reset`) - Database and storage cleanup

### Services

- **BongoImport** (abstract) - Base class for HTML cleaning and URL transformation
- **ImportPageFromBongo** - Imports pages with chunks/texts structure
- **ImportPostFromBongo** - Imports blog posts
- **ImportReviewFromBongo** - Imports customer reviews

### Models (Source Database)

All models use `protected $connection = 'import'` to query the legacy database:

- **Page** - Maps to `pages` table, has many chunks
- **Chunk** - Maps to `chunks` table, belongs to page, has many texts
- **Text** - Maps to `text` table, contains `raw_text` column
- **Post** - Maps to `blogposts` table
- **Review** - Maps to `reviewposts` table

### Helpers

- **ImageHelper** - Static methods for image downloading and storage

## Documentation

- **ARCHITECTURE.md** - Detailed architecture, class diagrams, flow diagrams, extension points
- **CLAUDE.md** - Quick reference for Claude Code with key files and commands
- **.cursorrules** - Cursor AI instructions with coding conventions
- **.github/copilot-instructions.md** - GitHub Copilot code templates

## Development

### Running Tests

```bash
vendor/bin/phpunit
```

### Code Style

This package uses Laravel Pint for code style enforcement:

```bash
# Check for style issues
vendor/bin/pint --test

# Fix style issues automatically
vendor/bin/pint
```

## Source Database Structure

### Pages Table
```
pages (table)
├─ id
├─ title
├─ titleexten (meta title extension)
├─ description (meta description)
├─ hidden (0=visible, 1=hidden)
└─ ogimage (open graph image filename)
    │
    ├─ chunks (table)
    │  ├─ id
    │  ├─ page_id
    │  └─ order
    │     │
    │     └─ text (table)
    │        ├─ id
    │        ├─ chunk_id
    │        ├─ text_index
    │        └─ raw_text (HTML content)
```

### Posts Table
```
blogposts (table)
├─ id
├─ title
├─ pagename (URL slug with dots)
├─ blog (HTML content)
├─ pagemeta (meta description, optional)
├─ postdate (publication date)
└─ unpublish (0=published, 1=draft)
```

### Reviews Table
```
reviewposts (table)
├─ id
├─ title
├─ contactname (reviewer name)
├─ contactemail (reviewer email)
├─ review (review content)
├─ rank (0-10 rating scale)
├─ reviewstate (1=approved, 0=pending)
├─ unpublish (0=published, 1=draft)
└─ modwhen (modified date)
```

## Troubleshooting

### Connection Issues

**Problem**: "Could not connect to the database"

**Solutions**:
- Verify host, database name, username, and password
- Check firewall rules and network access
- Ensure MySQL port 3306 is accessible
- Test connection with MySQL client

### Missing Images

**Problem**: "Remote image not found"

**Solutions**:
- Verify site URL is correct and accessible
- Check image paths in legacy database
- Confirm network/firewall allows outbound connections
- Test image URLs in browser

### Duplicate Content

**Problem**: Running import multiple times creates duplicates

**Solutions**:
- Use `import:reset` before re-importing
- Modify `firstOrNew` logic to use stricter unique constraints
- Check for existing records before importing

### Memory Issues

**Problem**: "Allowed memory size exhausted"

**Solutions**:
- Increase PHP `memory_limit` in `php.ini`
- Process imports in smaller batches
- Use chunked queries for large datasets

### Broken HTML

**Problem**: Content looks corrupted after import

**Solutions**:
- Review HTML cleaning rules in `BongoImport::cleanHtml()`
- Verify allowed tags list matches content requirements
- Inspect legacy markup structure for edge cases
- Add custom cleaning logic for specific patterns

## Security Considerations

### Database Credentials
- Prompted interactively (not stored in version control)
- Change config defaults for production environments
- Consider using Laravel's encrypted environment variables

### Image Downloads
- Currently disables SSL certificate verification (`CURLOPT_SSL_VERIFYPEER = 0`), which is acceptable for development only
- Enable certificate verification before running against production sites
- Validate file types before downloading
- Implement rate limiting for large imports
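
The first three points in the list above could be addressed along the lines of the sketch below. It is an assumption-laden example, not the package's code: it uses PHP's stream wrapper with `file_get_contents()` instead of cURL, and `hasAllowedImageExtension()`, `downloadImage()` and the extension allow-list are hypothetical. With cURL, the equivalent settings would be `CURLOPT_SSL_VERIFYPEER` set to `true` and `CURLOPT_SSL_VERIFYHOST` set to `2`.

```php
<?php

// Assumption: only these image types are expected in legacy content.
const ALLOWED_IMAGE_EXTENSIONS = ['jpg', 'jpeg', 'png', 'gif', 'webp'];

/** Check the file extension in the URL path against the allow-list. */
function hasAllowedImageExtension(string $url): bool
{
    $path = (string) parse_url($url, PHP_URL_PATH);
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    return in_array($extension, ALLOWED_IMAGE_EXTENSIONS, true);
}

/** Download an image with certificate verification enabled; null on failure. */
function downloadImage(string $url, int $timeoutSeconds = 15): ?string
{
    if (! hasAllowedImageExtension($url)) {
        return null; // refuse before making any network request
    }

    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
        'ssl'  => ['verify_peer' => true, 'verify_peer_name' => true],
    ]);

    $bytes = @file_get_contents($url, false, $context);

    return $bytes === false ? null : $bytes;
}
```

An extension check alone does not prove the response is an image, so inspecting the downloaded bytes (for example with `getimagesizefromstring()`) before storing them would be a sensible extra step.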

### HTML Content
- No XSS sanitisation applied (assumes trusted source)
- Consider adding additional sanitisation for user-facing content
- Review allowed tags list for security implications

## License

This package is proprietary software developed by Bespoke.ws Ltd. See [LICENSE](LICENSE) for details.

## Authors

- **Stuart Elliott** - [Bespoke.ws Ltd](https://bespokeuk.com)

## Support

For issues and questions:
- Email: stuart.elliott@bespokeuk.com
- Repository: https://bitbucket.org/designtec/import
