Project Mirrorfall - Detailed Writeup

Challenge

Name: PROJECT MIRRORFALL: The Exquisite Dilemma of Offence vs Defence
Given file: qn.md
Expected flag format: apoorvctf{...}

The challenge gives three linked objectives:

1. Find the correct Snowden archive PDF and extract a file-specific commit fragment (Variable X).
2. Parse the PDF and identify the second ECI codeword after APERIODIC.
3. Embed that codeword using all-MiniLM-L6-v2 and extract/round the first value (Variable Y).

Final answer found:

apoorvctf{7d88323_0.0245}

Step 0 - Read the prompt

Read qn.md:

cat qn.md

Key clues:

“public archive serving as an archival mirror for the 2013 intelligence disclosures”
“raw PDF classification guide dated September 5, 2013”
“overarching US encryption defeat program”
“first ECI listed is APERIODIC; find second ECI”
“use all-MiniLM-L6-v2 and take embedding[0], round 4 decimals”

This strongly points to Snowden document mirrors and specifically the NSA BULLRUN classification guide.

Step 1 - Locate the public archive + target PDF

Source used

GitHub repository: https://github.com/iamcryptoki/snowden-archive

Repository description matches the prompt: “A collection of all documents leaked by former NSA contractor and whistleblower Edward Snowden.”

Commands

Search repository candidates:

gh search repos snowden --limit 100


Clone likely mirror:

git clone --depth 1 https://github.com/iamcryptoki/snowden-archive /mnt/Nahil/apoorvctf/ai/snowden-archive


Find PDFs on the exact target date:

find /mnt/Nahil/apoorvctf/ai/snowden-archive -name "20130905*.pdf"

Relevant hits:

documents/2013/20130905-theguardian__sigint_enabling.pdf
documents/2013/20130905-theguardian__cryptanalysis_classification.pdf
documents/2013/20130905-theguardian__bullrun.pdf

The “overarching US encryption defeat program” clue maps to BULLRUN.
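The date filter and keyword match above can also be sketched in Python. This is a minimal illustration, not the method used during the solve: the helper names `find_dated_pdfs` and `pick_by_keyword` are hypothetical, and the keyword "bullrun" is simply the program name inferred from the clue.

```python
from pathlib import Path


def find_dated_pdfs(root: Path, date_prefix: str) -> list[Path]:
    """Return all PDFs under root whose filename starts with date_prefix."""
    return sorted(root.rglob(f"{date_prefix}*.pdf"))


def pick_by_keyword(paths: list[Path], keyword: str) -> list[Path]:
    """Keep only the paths whose filename contains keyword (case-insensitive)."""
    return [p for p in paths if keyword.lower() in p.name.lower()]


# The three hits listed above, used here as plain sample data.
hits = [
    Path("documents/2013/20130905-theguardian__sigint_enabling.pdf"),
    Path("documents/2013/20130905-theguardian__cryptanalysis_classification.pdf"),
    Path("documents/2013/20130905-theguardian__bullrun.pdf"),
]
print(pick_by_keyword(hits, "bullrun"))
```

Against a real clone, `find_dated_pdfs(Path("snowden-archive"), "20130905")` would produce the candidate list to feed into `pick_by_keyword`.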

Step 2 - Extract Variable X (file-specific latest commit SHA prefix)

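The prompt explicitly asks for the latest commit that touched this exact PDF file, not the repository HEAD. Because the clone in Step 1 is shallow (--depth 1), the local history holds only a single commit, so the file's own history has to come from GitHub.

Query the GitHub commits API, filtered by file path: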

gh api "repos/iamcryptoki/snowden-archive/commits?path=documents/2013/20130905-theguardian__bullrun.pdf&per_page=5"


Relevant output field:

sha: 7d88323521194ed8598624dc3a932930debdde1d

So:

Variable X = first 7 chars = 7d88323
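As a small sketch of this extraction step: the commits endpoint returns a JSON array, and this relies on the first entry being the most recent commit for the path. The function name `latest_commit_prefix` is hypothetical, and the sample payload is trimmed to the single field used here.

```python
import json


def latest_commit_prefix(commits_json: str, length: int = 7) -> str:
    """Return the first `length` hex characters of the newest commit's SHA."""
    commits = json.loads(commits_json)
    if not commits:
        raise ValueError("no commits returned for this path")
    # Assumption: the API lists commits newest first, so index 0 is the latest.
    return commits[0]["sha"][:length]


# Trimmed sample response containing only the field this step needs.
sample = '[{"sha": "7d88323521194ed8598624dc3a932930debdde1d"}]'
print(latest_commit_prefix(sample))
```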

Step 3 - Parse PDF and recover second ECI after APERIODIC

Convert the PDF to text and inspect the appendix and remarks sections:

pdftotext "/mnt/Nahil/apoorvctf/ai/snowden-archive/documents/2013/20130905-theguardian__bullrun.pdf" -


Important extracted lines:

“Appendix A lists specific BULLRUN capabilities…”
“Related ECIs include, but are not limited to:”
APERIODIC, AMBULANT, AUNTIE, PAINTEDEAGLE, ...

From the ordered list:

first ECI = APERIODIC
second ECI (immediately after) = AMBULANT

Normalize per prompt:

normalized codeword = ambulant
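The lookup can be sketched as a short text-processing function. It assumes that ECI codewords appear as runs of capital letters and that the list follows the anchor word directly; both are assumptions about the pdftotext output, and the name `eci_after` is hypothetical.

```python
import re


def eci_after(text: str, anchor: str = "APERIODIC") -> str:
    """Return the lowercase codeword that immediately follows `anchor`."""
    start = text.index(anchor)  # raises ValueError if the anchor is absent
    # Assumption: codewords are all-caps tokens of three or more letters.
    tokens = re.findall(r"\b[A-Z]{3,}\b", text[start:])
    if len(tokens) < 2:
        raise ValueError(f"no codeword found after {anchor}")
    return tokens[1].lower()


# Sample built from the extracted lines quoted above.
sample_text = (
    "Related ECIs include, but are not limited to:\n"
    "APERIODIC, AMBULANT, AUNTIE, PAINTEDEAGLE, ..."
)
print(eci_after(sample_text))
```

Scanning the text from the anchor onward, instead of splitting a single line on commas, keeps the lookup working if the list wraps across lines in the extracted text.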

Step 4 - Compute Variable Y using all-MiniLM-L6-v2

Model requirement

Prompt requires semantic embedding with all-MiniLM-L6-v2 and:

input = normalized 8-letter codeword (ambulant)
output = embedding[0]
round to 4 decimals

Practical environment note

Installing sentence-transformers with the full torch dependency failed due to a disk quota.
Switched to a lighter runtime, fastembed, which does not need torch, exposes the model under the same name (sentence-transformers/all-MiniLM-L6-v2), and returns the embedding vector directly.

Install:

python3 -m pip install --user fastembed


Compute embedding:

from fastembed import TextEmbedding

model = TextEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
vec = next(model.embed(["ambulant"]))

print(vec[0])
print(round(float(vec[0]), 4))


Observed value:

vec[0] = 0.024466823750619482
Rounded 4 dp -> Variable Y = 0.0245
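Since a different runtime was swapped in, it is worth checking how far the raw value sits from a rounding boundary. Different runtimes for the same model may produce slightly different floats, and a value close to a boundary could round differently. The helper below is a hypothetical sanity check and was not part of the original solve.

```python
import math


def rounding_margin(value: float, decimals: int = 4) -> float:
    """Distance from `value` to the nearest point where rounding would flip."""
    scaled = abs(value) * 10 ** decimals
    frac = scaled - math.floor(scaled)
    # Rounding flips at a fractional part of exactly 0.5 in scaled units.
    return abs(frac - 0.5) / 10 ** decimals


observed = 0.024466823750619482
print(round(observed, 4))
print(rounding_margin(observed))
```

For the observed value the margin is roughly 1.7e-5, so a backend that differed by less than that in the first component would still round to 0.0245.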

Step 5 - Construct and verify flag

Using X = 7d88323 and Y = 0.0245:

apoorvctf{7d88323_0.0245}

This was accepted by the platform.
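The assembly step can be written as a one-line formatter. This is a sketch with a hypothetical function name, `build_flag`; the `X_Y` layout inside the braces is taken from the accepted flag above.

```python
def build_flag(x: str, y: float, decimals: int = 4) -> str:
    """Join the commit prefix and the rounded embedding value into the flag."""
    # Doubled braces produce literal { and } inside an f-string.
    return f"apoorvctf{{{x}_{y:.{decimals}f}}}"


print(build_flag("7d88323", 0.024466823750619482))
```

Formatting with a fixed number of decimals keeps trailing zeros, which `str(round(y, 4))` would drop for a value such as 0.12.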

Reproducible End-to-End Script

#!/usr/bin/env python3
import json
import subprocess
from pathlib import Path

from fastembed import TextEmbedding


PDF_PATH = "documents/2013/20130905-theguardian__bullrun.pdf"
REPO = "iamcryptoki/snowden-archive"


def sh(cmd: list[str]) -> str:
    return subprocess.check_output(cmd, text=True)


def get_variable_x() -> str:
    out = sh([
        "gh",
        "api",
        f"repos/{REPO}/commits?path={PDF_PATH}&per_page=1",
    ])
    data = json.loads(out)
    sha = data[0]["sha"]
    return sha[:7]


def get_second_eci_from_pdf(local_pdf: Path) -> str:
    text = sh(["pdftotext", str(local_pdf), "-"])
    # Find the first line that contains APERIODIC.
    lines = [ln.strip() for ln in text.splitlines() if "APERIODIC" in ln]
    if not lines:
        raise RuntimeError("Could not find ECI line containing APERIODIC")

    # Parse from APERIODIC onward so any text before it on the line is ignored.
    # Example segment: APERIODIC, AMBULANT, AUNTIE, ...
    segment = lines[0][lines[0].index("APERIODIC"):]
    parts = [p.strip() for p in segment.replace(".", "").split(",") if p.strip()]
    if len(parts) < 2:
        raise RuntimeError("No ECI follows APERIODIC on the matched line")
    return parts[1].lower()


def get_variable_y(codeword: str) -> float:
    model = TextEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
    vec = next(model.embed([codeword]))
    return round(float(vec[0]), 4)


def main():
    x = get_variable_x()
    local_pdf = Path("snowden-archive") / PDF_PATH
    eci = get_second_eci_from_pdf(local_pdf)
    y = get_variable_y(eci)
    flag = f"apoorvctf{{{x}_{y:.4f}}}"

    print("X:", x)
    print("ECI:", eci)
    print("Y:", f"{y:.4f}")
    print("FLAG:", flag)


if __name__ == "__main__":
    main()


Sources

Challenge prompt: qn.md
Snowden archive mirror: https://github.com/iamcryptoki/snowden-archive
Target PDF path in mirror: documents/2013/20130905-theguardian__bullrun.pdf
GitHub commits API for file history:
- https://api.github.com/repos/iamcryptoki/snowden-archive/commits?path=documents/2013/20130905-theguardian__bullrun.pdf&per_page=1
Embedding model reference:
- sentence-transformers/all-MiniLM-L6-v2
