Building a WordPress Diagnostic Tool During My Long-Term Internship: A May Retrospective

June 2, 2026 / June 4, 2026

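Since May, I have been working as a long-term intern at QuinQue on a WordPress site diagnostic tool (Phase 1 MVP). The sales team hands over a list of external site URLs, and the tool scans each site's security, configuration, and so on, then produces scores and reports. Together with the team, I helped give shape to the first two weeks of that product.

This post looks back on what I was responsible for in May and what I learned from it.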

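Project overview

The tool's eventual flow looks roughly like this:

Load target sites from a CSV file or a URL
Fetch the HTML over HTTP and detect whether the site runs WordPress, along with its theme, plugins, and so on
Save the results to the DB and feed them into scoring, a dashboard, and reports

As of May, the main scope covered the foundation (Walking Skeleton) and fingerprint detection (a site's "fingerprint"). The dashboard UI and full scoring come in later phases.

As Intern A, I own the shared type definitions, the DB layer, the CLI skeleton, and the theme and plugin detection logic. Intern B (Chanchal) and I divided the work inside the same fingerprint Worker, which also covers WordPress detection, version detection, and so on.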

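Week 1: Laying the foundation (May 18–22)

The first week was the α phase: Walking Skeleton.
Rather than writing sophisticated detection logic right away, we aimed for a state where the monorepo builds, CI passes, and a single URL can be saved to the DB through the CLI.

The main things I worked on that week:

Monorepo setup with pnpm workspaces + Turborepo
packages/core: Zod schemas (Target, ScanResult, Finding, etc.) providing types and validation shared by all packages
packages/storage: SQLite + Drizzle ORM, migrations, and the Repository pattern
apps/cli: CSV loading and a skeleton of the scan command (URL → DB save)
Integration tests, CI, and demo preparation: GitHub Actions and hands-on practice with the PR workflow

At Friday's demo, we confirmed that pnpm scan can import a CSV or a single URL into SQLite. We had not reached real WordPress analysis yet, but this was the week the product got its backbone.

What stood out to me in this period was that, while reading the design document (the detailed design for Intern A), I had to work out for myself what belongs in which package. Writing code with the core / storage / cli boundaries in mind made the later Worker implementation easier.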

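Week 2: Fingerprint detection (May 25–29)

The second week was the β phase: Fingerprint complete.
I built up the logic for detecting themes and plugins from an external site's HTML, one day at a time.

Theme detection (Day 6)

Extract the theme slug from /wp-content/themes/{slug}/ paths in the HTML
Where possible, parse the style.css header (Theme Name, Version, Author, etc.)
HTML parsing with Cheerio, with detectors designed as pure functions (HTTP lives in the Worker/script)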
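
The two theme steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual detector: the function names extractThemeSlugs and parseStyleCssHeader are hypothetical, and plain regular expressions stand in for Cheerio so the example has no dependencies. Both functions are pure and take strings only, in line with the rule that HTTP stays outside the detector.

```typescript
// Hypothetical sketch: names and return shapes are illustrative assumptions.

/** Collect unique, lowercased theme slugs from /wp-content/themes/{slug}/ paths. */
function extractThemeSlugs(html: string): string[] {
  const pattern = /\/wp-content\/themes\/([A-Za-z0-9_-]+)\//g;
  const slugs: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const slug = match[1].toLowerCase();
    if (slugs.indexOf(slug) === -1) slugs.push(slug);
  }
  return slugs;
}

/** Parse "Key: value" lines from the first comment block of a style.css file. */
function parseStyleCssHeader(css: string): Record<string, string> {
  const header: Record<string, string> = {};
  const comment = /\/\*([\s\S]*?)\*\//.exec(css);
  if (!comment) return header;
  for (const line of comment[1].split(/\r?\n/)) {
    // Allow an optional leading "*" as used in some comment styles.
    const field = /^\s*\*?\s*([A-Za-z ]+?)\s*:\s*(.+?)\s*$/.exec(line);
    if (field) header[field[1]] = field[2];
  }
  return header;
}
```

The caller (a probe script or Worker) would fetch the page and, if a slug is found, fetch that theme's style.css and pass its text to parseStyleCssHeader. Fixture-based tests can feed literal strings without any network access.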

Plugin detection: two methods (Days 7–8)

Method 1: Asset paths (Day 7)

Detect plugins from script / link URLs that contain /wp-content/plugins/{slug}/.
Because the evidence URLs (assetUrls) are kept, this is a high-confidence method.
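
As a rough sketch of the asset-path method, assuming a regex-based scan instead of Cheerio: the name detectPluginsFromAssets and the AssetPluginHit shape are hypothetical, and only the assetUrls field name comes from the description above.

```typescript
// Hypothetical sketch: a regex scan stands in for Cheerio-based parsing.

interface AssetPluginHit {
  slug: string;
  assetUrls: string[]; // evidence: every script/link URL that pointed at this plugin
}

function detectPluginsFromAssets(html: string): AssetPluginHit[] {
  // Capture the src/href value of <script> and <link> tags only.
  const attrPattern = /<(?:script|link)\b[^>]*?\b(?:src|href)\s*=\s*["']([^"']+)["']/gi;
  const slugPattern = /\/wp-content\/plugins\/([A-Za-z0-9_-]+)\//;
  // A prototype-less object avoids clashes with inherited keys such as "constructor".
  const bySlug: Record<string, string[]> = Object.create(null);

  let tag: RegExpExecArray | null;
  while ((tag = attrPattern.exec(html)) !== null) {
    const url = tag[1];
    const slugMatch = slugPattern.exec(url);
    if (!slugMatch) continue;
    const slug = slugMatch[1].toLowerCase();
    const urls = bySlug[slug] || (bySlug[slug] = []);
    if (urls.indexOf(url) === -1) urls.push(url);
  }

  return Object.keys(bySlug)
    .sort()
    .map((slug) => ({ slug, assetUrls: bySlug[slug] }));
}
```

Restricting the scan to script and link tags means an ordinary anchor that merely links to a plugin path is not counted as evidence, which is part of why this route is comparatively trustworthy.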

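Method 2: Signatures (Day 8)

For 20 major plugins, I built a dictionary of distinctive strings in the HTML (class names, script names, etc.) and detect them by substring match.
This can hit on sites that do not expose asset URLs, but I also learned the trade-off: a word merely appearing in the page body can produce a false positive.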
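
A minimal sketch of the signature approach, with case-insensitive substring matching. The three dictionary entries below are illustrative examples only and are not the project's 20-entry dictionary; detectPluginsFromSignatures and the type names are likewise hypothetical.

```typescript
// Hypothetical sketch: dictionary contents are illustrative assumptions.

interface SignatureEntry {
  slug: string;
  patterns: string[]; // distinctive substrings such as class or script names
}

interface SignatureHit {
  slug: string;
  matchedPatterns: string[]; // evidence: which patterns were found
}

const EXAMPLE_SIGNATURES: SignatureEntry[] = [
  { slug: "contact-form-7", patterns: ["wpcf7"] },
  { slug: "elementor", patterns: ["elementor-widget", "elementor-frontend"] },
  { slug: "woocommerce", patterns: ["woocommerce"] },
];

function detectPluginsFromSignatures(
  html: string,
  signatures: SignatureEntry[] = EXAMPLE_SIGNATURES,
): SignatureHit[] {
  const haystack = html.toLowerCase();
  const hits: SignatureHit[] = [];
  for (const entry of signatures) {
    const matched = entry.patterns.filter(
      (pattern) => haystack.indexOf(pattern.toLowerCase()) !== -1,
    );
    if (matched.length > 0) hits.push({ slug: entry.slug, matchedPatterns: matched });
  }
  return hits;
}
```

The false-positive risk is easy to reproduce with this sketch: a page whose body text simply mentions "WooCommerce" matches the woocommerce entry even when no plugin markup is present. Narrower patterns, such as specific class names, would reduce that.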

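Merge (Day 9)

I implemented plugins-merge, which combines the Day 7 and Day 8 results into one row per slug.

sources: asset / signature / both
assetUrls and matchedPatterns record why a plugin was detected
Everything is exposed through a single entry point, detectPlugins(html)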
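
The merge step might look something like the sketch below. The field names sources, assetUrls, and matchedPatterns follow the description above, while mergePluginHits, the MergedPlugin type, and the choice to represent sources as an array are assumptions made for illustration.

```typescript
// Hypothetical sketch of merging two detector outputs into one row per slug.

type PluginSource = "asset" | "signature";

interface MergedPlugin {
  slug: string;
  sources: PluginSource[];
  assetUrls: string[];
  matchedPatterns: string[];
}

function mergePluginHits(
  assetHits: { slug: string; assetUrls: string[] }[],
  signatureHits: { slug: string; matchedPatterns: string[] }[],
): MergedPlugin[] {
  const bySlug: Record<string, MergedPlugin> = Object.create(null);

  // Fetch the row for a slug, creating an empty one on first sight.
  const rowFor = (slug: string): MergedPlugin =>
    bySlug[slug] ||
    (bySlug[slug] = { slug, sources: [], assetUrls: [], matchedPatterns: [] });

  const addUnique = <T>(list: T[], value: T): void => {
    if (list.indexOf(value) === -1) list.push(value);
  };

  for (const hit of assetHits) {
    const row = rowFor(hit.slug);
    addUnique(row.sources, "asset");
    for (const url of hit.assetUrls) addUnique(row.assetUrls, url);
  }
  for (const hit of signatureHits) {
    const row = rowFor(hit.slug);
    addUnique(row.sources, "signature");
    for (const pattern of hit.matchedPatterns) addUnique(row.matchedPatterns, pattern);
  }

  return Object.keys(bySlug)
    .sort()
    .map((slug) => bySlug[slug]);
}
```

Under these assumptions, an entry point such as detectPlugins(html) would run both detectors on the same HTML string and pass their outputs to this function. A row with only "signature" in sources and no assetUrls is then easy to flag as lower confidence later.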

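The Vitest suite grew to 27 tests, giving us fixture-based quality assurance.

Accuracy check on real sites (Day 10)

Once unit tests alone were no longer enough, I ran measurements against 10 real sites.

week2-fingerprint-probe.mjs: fetches the HTML of real URLs and runs the detectors
week2-fingerprint-sites.csv: 10 sites, including wordpress.org, Elementor, and WPBeginner
pnpm demo:week2: runs everything in one go and saves the results as JSON
Accuracy notes: false positives and misses recorded in table form

The probe succeeded on 10/10 sites. At the same time, I documented the honest limitations:

On marketing-style top pages, the theme sometimes cannot be detected
Signature matching sometimes hits jetpack, woocommerce, and the like purely from mentions in the page text
Some sites (contactform7.com) blocked the request with a 403 and were replaced with alternative URLs

I think the value of Day 10 was showing not just that it works, but how far it is correct.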

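What I learned through development

Small and steady: what Walking Skeleton means
Instead of aiming for a perfect scan --json from the start, we moved in stages (types → DB → CLI → detectors → live measurement), which kept the project demoable every week.
Separating pure functions from HTTP
Detectors receive only an HTML string, and fetching lives outside (in the probe, and in the future Worker).
Thanks to this rule, Vitest runs fast, the tests are stable, and it was easier to split work with Intern B.
"Merging" two detection methods
When the same plugin is found through two routes, I learned it is important not to simply concatenate arrays but to merge by slug and keep the evidence.
This shape should carry over as-is to later scoring and reporting.
Making PRs and reviews part of daily work
From Week 1, I got into the habit of opening small PRs frequently.
For Days 6–10 as well, I split PRs one topic per day (theme → plugins-asset → signature → merge → probe) to keep them easy to review.
The real world is harder than fixtures
Test HTML is "clean," but production sites involve CDNs, minification, bot protection, marketing pages, and more.
Having measurements and notes makes the priorities for next-phase improvements clear (stricter signatures, slug normalization, scan --json integration).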

May milestones

When	What
End of Week 1	Monorepo + CI + CSV/URL → SQLite
Day 6	Theme detection
Days 7–8	Plugin detection (asset + signature)
Day 9	Merge + 27 tests
Day 10	10-site measurement + demo materials

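Looking ahead to June

From June, we plan to move into scoring and Next.js dashboard setup (Week 3).
The fingerprint results built in May will eventually feed into scores, visualization, and reports. Seeing that flow come into view was my biggest takeaway this month.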

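Final thoughts

May was a month in which I did more than learn how to use tools: I read the design document, decided package boundaries, protected quality with tests and PRs, and validated on real sites, experiencing the full flow of product development.

Rather than handing over answers right away, the mentors leave room for me to organize the requirements myself before asking for advice. Within that, I also learned to collaborate by working in parallel with Intern B on the same Worker pattern.

Full pnpm scan --json integration and the dashboard are still to come, but for the part that takes a WordPress site's "fingerprint," I feel I got to experience the whole cycle of code, tests, and live measurement.

For anyone who wants to learn both design and quality while working on real product code, I think the QuinQue internship is a very good environment.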

The original text follows.

Building a WordPress Diagnostic Tool During My Internship — A May Retrospective

Starting in May 2026, I joined QuinQue's long-term internship program to work on Phase 1 of a WordPress site diagnostic tool. The product will eventually help a sales team scan external websites, score them, and generate reports. In May, my focus was the first two weeks of the roadmap: building the foundation and implementing fingerprint detection (theme + plugins).

This post reflects on what I built and what I learned.

Project overview

The long-term flow looks like this:

Read target URLs from CSV or CLI input
Fetch HTML and detect WordPress fingerprints (theme, plugins, version, etc.)
Store results, compute scores, and expose them in a dashboard and reports

In May, we scoped Week 1 (Walking Skeleton) and Week 2 (Fingerprint). Scoring, dashboard UI, and LLM features come in later phases.

As Intern A, I owned shared schemas, the storage layer, CLI skeleton, and theme/plugin detectors. I paired with Intern B (Chanchal) on the fingerprint worker (WordPress detection, version, PHP/server signals).

Week 1: Laying the foundation (May 18–22)

Week 1 was the α phase — Walking Skeleton.
The goal was not deep detection yet, but proving:

The monorepo builds
CI passes
One URL or CSV can flow through CLI → validation → SQLite

What I worked on:

pnpm workspaces + Turborepo monorepo setup
packages/core: Zod schemas (Target, ScanResult, Finding, etc.)
packages/storage: SQLite + Drizzle ORM, migrations, repositories
apps/cli: CSV import and scan command skeleton
Integration tests, CI, demo scripts: GitHub Actions and PR workflow practice

By Friday, we could run pnpm scan to import targets into SQLite. It was the backbone of the product, even before real WordPress parsing.

Reading the Intern A design doc and deciding what belongs in core vs storage vs cli was a valuable exercise. Clear package boundaries made Week 2 much smoother.

Week 2: Fingerprint detection (May 25–29)

Week 2 was the β phase — Fingerprint complete.
I implemented theme and plugin detection day by day.

Theme (Day 6)

Extract theme slug from HTML paths, optionally parse style.css headers (name, version, author).
Detectors are pure functions — HTTP stays outside.

Plugins — two methods (Days 7–8)

Asset paths: detect from /wp-content/plugins/{slug}/ in script/link URLs (high confidence, keeps evidence URLs)
Signatures: dictionary of ~20 popular plugins, case-insensitive HTML substring match (catches plugins without visible asset paths, but can false-positive on marketing copy)

Merge (Day 9)

Combine both lists by slug into one row with sources, assetUrls, and matchedPatterns.
Single entry point: detectPlugins(html). 27 Vitest tests across theme + plugin detectors.

Live accuracy run (Day 10)

Fixtures are not enough — I built a probe script for 10 real sites (wordpress.org, Elementor, WPBeginner, etc.), ran pnpm demo:week2, and documented false positives and misses.
Results: 10/10 sites probed successfully, with honest notes:

Theme detection is strong when style.css appears in HTML; weak on marketing homepages
Asset-based plugin detection is reliable
Signature matching needs Phase 2 tuning (e.g. jetpack / woocommerce from page text only)
Some hosts block bots (HTTP 403) — we swapped URLs and documented it

What I learned

Walking Skeleton first — types → DB → CLI → detectors → live probe, demoable every week
Separate pure detectors from HTTP — fast, stable tests; easier pairing with Intern B
Merge with evidence — one slug per plugin, keep how it was detected
Small PRs daily — easier review, clearer history
Real sites ≠ fixtures — live runs reveal CDN, minification, bot blocking, and false positives

May milestones

When	What
End of Week 1	Monorepo + CI + CSV/URL → SQLite
Day 6	Theme detector
Days 7–8	Plugin detection (asset + signature)
Day 9	Merge layer + 27 tests
Day 10	10-site live probe + accuracy docs

Looking ahead to June

Week 3 starts scoring and Next.js dashboard setup. May's fingerprint work will eventually feed scores, charts, and reports — and seeing that pipeline take shape was my biggest takeaway.

Final thoughts

May was not just about learning tools. It was about reading design docs, drawing package boundaries, protecting quality with tests and PRs, and validating on real websites.

Mentors give space to think before asking for help. Working in parallel with Intern B on the same worker pattern taught me practical collaboration.

Full pnpm scan --json integration and the dashboard are still ahead, but I'm proud of shipping the fingerprint slice end to end: code, tests, and live accuracy notes.

If you want an internship where you touch real product code and grow in both implementation and design thinking, QuinQue is a strong environment.
