
LogStudio

A lightweight log ingestion + exploration tool: a CLI ingests mixed-format logs into SQLite with FTS5, and a Blazor UI enables fast search and correlation by trace, request, or session ID.

Tags: dotnet · blazor · sqlite · fts5 · observability · devtools · ef-core

Overview

I built LogStudio to solve a practical debugging problem: quickly explore large log sets with both structured JSON logs and plaintext logs, and correlate events by trace ID / request ID / session ID.

Highlights:

  • CLI ingestion from folders/patterns into a local SQLite DB
  • Automatic parsing of JSON logs and plaintext logs with trailing JSON payloads
  • SQLite FTS5 full-text search for fast “message contains” queries
  • Blazor UI (MudBlazor) with quick filtering, virtualization, and detail views
  • Minimal API layer for query + log detail retrieval

The problem I was solving

When debugging backend systems, raw log files create friction:

  • logs are large and slow to search
  • formats vary by source (JSON vs plaintext)
  • correlating across request boundaries is manual
  • “grep” gets you a line, not the context

I wanted a tool that feels like a local “mini observability console”:

  • ingest quickly
  • search instantly
  • click into an event and see context + correlation threads

Architecture

LogStudio is split into clear layers:

  • Core
    • Domain model: LogEvent
  • Infrastructure
    • EF Core + SQLite storage
    • Ingestion pipeline
    • FTS5 initialization and indexing
  • Host
    • Web app (Blazor interactive server rendering)
    • Minimal API endpoints for querying
    • UI components for exploration
  • Tools
    • CLI that ingests logs from a glob/pattern

This separation keeps parsing/storage logic testable and reusable outside the UI.


Storage & indexing strategy

Why SQLite

  • Local-first workflows: no infra required
  • Great fit for “I want to inspect logs now”
  • Easy sharing of a single .db file as an artifact

Key schema objects

  • Events table (EF Core entity LogEvent)

    • Timestamp, level, logger, message, source file, line number
    • Optional correlation IDs: trace/request/session
    • JSON properties (PropsJson)
  • LogEventFts virtual table (FTS5)

    • Message, PropsJson
    • Linked to Events via rowid = Events.Id

-- Enable write-ahead logging before creating tables
PRAGMA journal_mode=WAL;

-- Contentless FTS5 table: indexes the text without storing a second copy
CREATE VIRTUAL TABLE IF NOT EXISTS LogEventFts
USING fts5(Message, PropsJson, content='');

The content='' option creates a “contentless” FTS5 table: searches run against the token index, while the actual text lives only in the Events table (joined back via rowid = Events.Id). This saves significant storage on large log datasets.
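
For orientation, here is a hedged sketch of what the LogEvent entity could look like given the columns above; the property names are my assumptions, not the project's actual code.

// Hypothetical sketch of the LogEvent entity based on the columns above;
// property names are illustrative, not the actual implementation.
public sealed class LogEvent
{
    public long Id { get; set; }                 // rowid; shared with LogEventFts
    public DateTime TimestampUtc { get; set; }
    public string Level { get; set; } = "Information";
    public string? Logger { get; set; }
    public string Message { get; set; } = "";
    public string? SourceFile { get; set; }
    public int? LineNumber { get; set; }
    public string? TraceId { get; set; }         // optional correlation IDs
    public string? RequestId { get; set; }
    public string? SessionId { get; set; }
    public string? PropsJson { get; set; }       // structured properties as JSON
}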

WAL mode

The database initializes in WAL mode, which improves write throughput during batch ingestion and lets readers proceed while a write transaction is open.
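
In EF Core terms, applying the pragma at startup can be as simple as a raw command (a sketch; the actual initialization code may differ):

// Sketch: enable WAL once at startup; the setting persists in the .db file
db.Database.ExecuteSqlRaw("PRAGMA journal_mode=WAL;");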


Ingestion pipeline

Input support

The ingestor supports two common patterns:

  1. Pure JSON logs
  • If a line begins with {, the ingestor attempts a JSON parse and extracts:
    • timestamp (timestamp, ts, or @t)
    • level (level, loglevel, or @l)
    • message (message or @m)
    • message template (messageTemplate or @mt)
    • logger (logger, sourceContext, or category)
    • trace/request/session IDs (including correlationId → requestId)
    • property payload (props) or any remaining fields as “extras”
  2. Plaintext logs with optional trailing JSON
  • Attempts to parse a leading timestamp
  • Extracts a trailing JSON payload if one appears after a {
  • Pulls trace/request/session IDs out of that JSON when present
  • Stores message text and props cleanly
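
To make the JSON branch (item 1 above) concrete, here is a minimal sketch using System.Text.Json; TryParseJsonLine and its alias handling are illustrative assumptions, not the actual parser.

// Illustrative sketch of the JSON-line branch; LogEvent is the entity
// sketched earlier, and the alias lists follow the description above.
private static LogEvent? TryParseJsonLine(string line)
{
    if (!line.StartsWith('{')) return null;
    try
    {
        using var doc = JsonDocument.Parse(line);
        var root = doc.RootElement;

        // Case-insensitive lookup, trying each alias in priority order
        string? First(params string[] names)
        {
            foreach (var name in names)
                foreach (var prop in root.EnumerateObject())
                    if (string.Equals(prop.Name, name, StringComparison.OrdinalIgnoreCase))
                        return prop.Value.ValueKind == JsonValueKind.String
                            ? prop.Value.GetString()
                            : prop.Value.GetRawText();
            return null;
        }

        return new LogEvent
        {
            TimestampUtc = DateTime.TryParse(First("timestamp", "ts", "@t"),
                System.Globalization.CultureInfo.InvariantCulture,
                System.Globalization.DateTimeStyles.AdjustToUniversal,
                out var t) ? t : DateTime.UtcNow,
            Level     = First("level", "loglevel", "@l") ?? "Information",
            Message   = First("message", "@m") ?? First("messageTemplate", "@mt") ?? line,
            Logger    = First("logger", "sourceContext", "category"),
            TraceId   = First("traceId"),
            RequestId = First("requestId", "correlationId"),  // correlationId → requestId
            SessionId = First("sessionId"),
            PropsJson = First("props")
        };
    }
    catch (JsonException)
    {
        return null;  // fall through to the plaintext branch
    }
}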

Batch behavior and transactions

  • Ingestion runs inside a DB transaction per file
  • Records flush in batches (e.g., 2000 events) to balance memory and throughput
  • After inserting events via EF Core, the pipeline also inserts into FTS5 with the same IDs

// Memory-efficient streaming with IAsyncEnumerable
private static async IAsyncEnumerable<string> ReadLinesAsync(string path,
    [EnumeratorCancellation] CancellationToken ct = default)
{
    await using var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
        FileShare.ReadWrite | FileShare.Delete);  // Read active log files
    using var sr = new StreamReader(stream);

    // Loop on ReadLineAsync (null at end-of-stream) rather than checking
    // StreamReader.EndOfStream, which can block on a synchronous read
    while (await sr.ReadLineAsync() is { } s)
    {
        ct.ThrowIfCancellationRequested();
        yield return s;
    }
}

The FileShare.ReadWrite flag allows reading logs that are still being written to—essential for tailing active application logs.

This approach keeps ingestion fast without compromising query performance later.
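
Putting the pieces together, a sketch of the per-file loop under the assumptions above; AppDb, db.Events, and TryParseJsonLine are illustrative names, and the 2000-row batch size comes from the description above.

// Sketch: stream lines, parse, and flush in batches inside one transaction
const int BatchSize = 2000;
await using var tx = await db.Database.BeginTransactionAsync(ct);
var batch = new List<LogEvent>(BatchSize);

async Task FlushAsync()
{
    db.Events.AddRange(batch);
    await db.SaveChangesAsync(ct);  // EF assigns Ids on insert
    foreach (var e in batch)        // mirror into the contentless FTS index
        await db.Database.ExecuteSqlInterpolatedAsync(
            $"INSERT INTO LogEventFts(rowid, Message, PropsJson) VALUES ({e.Id}, {e.Message}, {e.PropsJson})",
            ct);
    batch.Clear();
}

await foreach (var line in ReadLinesAsync(path, ct))
{
    if (TryParseJsonLine(line) is { } evt) batch.Add(evt);
    if (batch.Count >= BatchSize) await FlushAsync();
}
if (batch.Count > 0) await FlushAsync();
await tx.CommitAsync(ct);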


Query model (API)

Query endpoint

The minimal API accepts a query payload like:

  • date range (FromUtc, ToUtc)
  • levels filter (Levels) and/or minimum level (MinLevel)
  • sorting (SortBy, SortDirection)
  • paging (Offset, Limit)
  • text filter (Text)
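
Concretely, the payload might bind to a record like this; only the property names come from the list above, while the defaults, route, and QueryService helper are assumptions.

// Illustrative request shape for the query endpoint
public sealed record LogQueryRequest(
    DateTime? FromUtc,
    DateTime? ToUtc,
    string[]? Levels,
    string? MinLevel,
    string? SortBy = "Timestamp",
    string? SortDirection = "desc",
    int Offset = 0,
    int Limit = 200,
    string? Text = null);

// Hypothetical minimal API mapping (route and service names assumed)
app.MapPost("/api/logs/query", async (LogQueryRequest req, AppDb db, CancellationToken ct)
    => Results.Ok(await QueryService.RunAsync(req, db, ct)));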

Query language ergonomics

To make it feel natural, the text field supports:

  • trace:<guid> or trace=<guid>
  • request:<guid> / req:<guid> / corr:<guid> etc.
  • session:<string>
  • id:<guid> (matches trace or request)
  • a raw GUID (auto-detected)

This avoids the need for a separate UI mode just for correlation IDs.

Two search modes

  1. FTS mode
  • If text starts with message ~ ..., the query uses FTS5:
    • Id IN (SELECT rowid FROM LogEventFts WHERE LogEventFts MATCH $q)
  2. LIKE mode
  • Otherwise, it searches:
    • Message LIKE $like OR PropsJson LIKE $like
// Correlation ID prefix parsing for ergonomic queries
if (lowered.StartsWith("trace:") || lowered.StartsWith("trace="))
{
    target = IdTarget.Trace;
}
else if (lowered.StartsWith("request:") || lowered.StartsWith("req:") ||
         lowered.StartsWith("correlation:") || lowered.StartsWith("corr:"))
{
    target = IdTarget.Request;
}
else if (lowered.StartsWith("session:"))
{
    target = IdTarget.Session;
}
else if (Guid.TryParse(text, out _))
{
    target = IdTarget.Any;  // Auto-detect raw GUIDs
}

This query DSL lets you type trace:abc123 or paste a raw GUID and get instant correlation filtering without switching UI modes.

Parameterized SQL for safety and speed

Where it matters, the query path uses parameterized SQL to avoid injection and to keep the query planner stable.
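
A sketch of the two modes as parameterized commands with Microsoft.Data.Sqlite; conn, ftsMode, match, and text are assumed locals, and the exact SQL shape is illustrative.

// Sketch: both search modes as parameterized SQL (Microsoft.Data.Sqlite)
using var cmd = conn.CreateCommand();
if (ftsMode)  // "message ~ ..." prefix was detected
{
    cmd.CommandText =
        "SELECT Id FROM Events WHERE Id IN " +
        "(SELECT rowid FROM LogEventFts WHERE LogEventFts MATCH $q)";
    cmd.Parameters.AddWithValue("$q", match);
}
else          // fallback: LIKE over message and props
{
    cmd.CommandText =
        "SELECT Id FROM Events WHERE Message LIKE $like OR PropsJson LIKE $like";
    cmd.Parameters.AddWithValue("$like", "%" + text + "%");
}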


Log detail experience

The log detail endpoint returns:

  • the selected event
  • structured properties (parsed from JSON)
  • “pretty printed” JSON for readability
  • context:
    • 5 events before
    • 5 events after
  • correlation sets:
    • related by trace ID
    • related by request ID
    • related by session ID

In practice, this reduces debugging time because you stop “guessing what happened around it.”
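
The context window is cheap to compute; a sketch in EF Core, assuming the LogEvent entity above and a ±5 window keyed by Id:

// Sketch: fetch the ±5 context window around a selected event
var before = await db.Events.Where(e => e.Id < id)
    .OrderByDescending(e => e.Id).Take(5).ToListAsync(ct);
before.Reverse();  // restore chronological order

var after = await db.Events.Where(e => e.Id > id)
    .OrderBy(e => e.Id).Take(5).ToListAsync(ct);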


UI design

The UI is Blazor (interactive server rendering) with MudBlazor components.

Key UX choices:

  • Virtualized grids so large datasets remain responsive
  • Quick toggles for detail visibility
  • A log detail dialog that emphasizes:
    • the message
    • the properties (with kind/type)
    • the context and correlation sets

Performance notes

What makes it fast:

  • SQLite indexing on timestamp + correlation IDs
  • FTS5 for full-text search
  • batch ingestion + WAL mode
  • lightweight API payloads (query rows are DTOs)
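
In EF Core terms, the index set might be declared like this (a sketch; the real model configuration may differ):

// Sketch: indexes that back the hot query paths (EF Core OnModelCreating)
protected override void OnModelCreating(ModelBuilder b)
{
    var e = b.Entity<LogEvent>();
    e.HasIndex(x => x.TimestampUtc);   // date-range filters and sorting
    e.HasIndex(x => x.TraceId);        // correlation lookups
    e.HasIndex(x => x.RequestId);
    e.HasIndex(x => x.SessionId);
}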

What I’d add next

  • Pluggable parsers (Serilog/ELK formats, NLog layouts)
  • Saved searches + bookmarks
  • Export correlated threads (trace/request) to a single shareable view
  • Retention & DB compaction controls
  • Import from a streaming source (HTTP tailing)

Results


  • Dramatically reduced “log spelunking” time by making search and correlation interactive.
  • Converted messy log files into a durable, queryable dataset with strong UX affordances.
  • Produced a reusable pattern: ingest → index → query → correlate → explain.

Technologies Used

  • .NET (targeting modern runtime)
  • Blazor (interactive server rendering)
  • MudBlazor UI components
  • EF Core + SQLite
  • SQLite FTS5 full-text search
  • Minimal APIs