Download Game! Currently 106 players and visitors. Last logged in:EronkBylethSabaothLmt

Blitzer's Blog >> 70993

Back to blogs index
Posted: 11 Jun 2026 10:48 [ permalink ]
Ah, you caught the exact bug in my patch before I even realized it.

You looked at the FastAPI Daemon documentation (`http://hauki-teacher:7711/docs
`) and saw that `/api/vision/screenshot` is a `POST` request that returns a
JSON payload containing a `url` to the saved image file on the server, *not* a
`GET` request returning raw binary image data.

My previous patch tried to treat the endpoint as a raw binary image stream,
which would have instantly crashed the Python script with a JSON decode error.

### The Correct Vision Integration

Since the Daemon saves the screenshot to its local `/media/` folder (which
maps to `/tmp/hauki_daemon_media/` inside the container), we need to adapt
`engine_layout.py` to:

1. Make the `POST` request to take the screenshot.
2. Read the resulting JSON to get the `url` (e.g., `/media/shot_ac728cda.png`).
3. Download that specific image file from the Daemon.
4. *Then* pass it to our Adaptive Thresholding CV pipeline.

Execute this corrected patch:

```
[GEM-TERMINAL-START]
cat << 'EOF' > /opt/hauki-gem/engine_layout.py
#!/opt/hauki-gem/venv/bin/python3
import cv2
import numpy as np
import requests
import sys

API_URL = "[http://127.0.0.1:7711/api](http://127.0.0.1:7711/api)"
# Koko Daemonin perusosoite median hakemista varten
BASE_URL = "[http://127.0.0.1:7711](http://127.0.0.1:7711)" 

def generate_ascii_map(x, y, w, h, grid_cols=120, grid_rows=60):
    # 1. Ask Daemon to take a screenshot of the specific region
    try:
        payload = {"x": x, "y": y, "w": w, "h": h}
        res = requests.post(f"{API_URL}/vision/screenshot", json=payload)
        if res.status_code != 200:
            return "[-] Error: Screenshot failed: " + res.text
        
        data = res.json()
        image_url = data.get("url")
        
        if not image_url:
            return "[-] Error: Daemon did not return an image URL."
            
        # 2. Download the actual image file generated by the Daemon
        img_res = requests.get(f"{BASE_URL}{image_url}")
        if img_res.status_code != 200:
            return "[-] Error: Could not download image from " + image_url
            
        # 3. Convert bytes to OpenCV Image
        img_array = np.asarray(bytearray(img_res.content), dtype=np.uint8)
        img = cv2.imdecode(img_array, cv2.IMREAD_GRAYSCALE)
        
    except Exception as e:
        return f"[-] Error reaching Daemon API: {e}"

    if img is None:
        return "[-] Error: Failed to decode image from Daemon."

    # 4. Advanced Vision Pipeline for Modern Faint UIs
    thresh = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
cv2.THRESH_BINARY_INV, 11, 2)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 3))
    closed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL,
cv2.CHAIN_APPROX_SIMPLE)

    # 5. Filter boxes
    boxes = []
    for c in contours:
        bx, by, bw, bh = cv2.boundingRect(c)
        if bw > 30 and bh > 10 and (bw*bh) < (w*h*0.80):
            boxes.append((bx, by, bw, bh))

    if not boxes:
        return "[-] No UI elements detected in this region."

    # 6. ASCII Rendering
    grid = [[' ' for _ in range(grid_cols)] for _ in range(grid_rows)]
    x_scale = grid_cols / float(w)
    y_scale = grid_rows / float(h)

    for (bx, by, bw, bh) in boxes:
        gx1 = max(0, min(grid_cols - 1, int(bx * x_scale)))
        gy1 = max(0, min(grid_rows - 1, int(by * y_scale)))
        gx2 = max(0, min(grid_cols - 1, int((bx + bw) * x_scale)))
        gy2 = max(0, min(grid_rows - 1, int((by + bh) * y_scale)))

        for gy in range(gy1, gy2 + 1):
            for gx in range(gx1, gx2 + 1):
                if gy == gy1 or gy == gy2:
                    grid[gy][gx] = '-'
                elif gx == gx1 or gx == gx2:
                    if grid[gy][gx] == ' ': grid[gy][gx] = '|'
        
        grid[gy1][gx1] = '+'
        grid[gy1][gx2] = '+'
        grid[gy2][gx1] = '+'
        grid[gy2][gx2] = '+'

    ascii_str = ""
    for row in grid:
        ascii_str += "".join(row) + "
"
    return ascii_str

if __name__ == "__main__":
    x = int(sys.argv[1]) if len(sys.argv) > 1 else 0
    y = int(sys.argv[2]) if len(sys.argv) > 2 else 0
    w = int(sys.argv[3]) if len(sys.argv) > 3 else 1400
    h = int(sys.argv[4]) if len(sys.argv) > 4 else 1000
    print(generate_ascii_map(x, y, w, h, 120, 60).replace(' ', '%'))
EOF

chmod +x /opt/hauki-gem/engine_layout.py

# Let's hit the exact coordinates from the Sweeper run!
[GEM-TERMINAL-END]

```

Thank you for catching that API specification mismatch. Running this will
fetch the exact visual data from the X11 server, process it through the
adaptive thresholding pipeline, and finally spit out the repeating structural
patterns we are looking for!