Environmental Reporting Framework — Bangladesh Upazilas

10 data categories & what to collect per upazila

1. Geographic & Administrative Identity

Upazila name, district, division; GPS boundary coordinates; area (km²); number of unions, mouzas, villages; population (BBS census); population density; literacy rate; nearest major urban centre and distance. This data anchors every other section — confirm from DLRS mouza maps or CEGIS, not assumptions.

Spatial · Quantitative

2. Geology, Geomorphology & Soils

Physiographic zone (Tista Fan, Old Brahmaputra Floodplain, Sylhet Basin, Meghna Floodplain, etc.); flood plain type (active/inactive/high); dominant soil series from SRDI survey; soil texture, pH, organic matter content; land type classification on the F0–F3 flood inundation scale; elevation and slope from SRTM DEM data. Name specific soil series where known — do not generalise.

Spatial · Semi-quantitative

3. Land Use Classification (Current)

Zone-wise area breakdown: agricultural (irrigated/rain-fed), homestead/settlement, water bodies, wetlands, forest/plantation, social infrastructure, industrial, char land, cultural heritage zones. Use provided zoning data or RS imagery (Sentinel-2, Landsat 9). Calculate percentage of total area per zone. Critically: comment on what is absent, surprisingly small, or disproportionately large relative to the upazila's known environmental character.

Spatial · Quantitative

4. Hydrology & Water Bodies

Named rivers, tributaries, beels, haors, baors, canals with specific local names — never default to a regional river if a more locally accurate identification is possible. Water body area (seasonal and permanent); river discharge from BWDB gauge stations; tidal influence zone; flood frequency and inundation depth class; drainage network connectivity; groundwater depth from DPHE/BWDB records; arsenic contamination status from DPHE maps.

Spatial · Quantitative · Temporal

5. Disaster Vulnerability & Climate

Historical flood years and severity (FFWC); river erosion hotspot unions (BWDB erosion index); cyclone/tornado/nor'wester/hailstorm frequency and documented impact years; drought occurrence by type (pre-monsoon Rabi/Pre-Kharif; post-monsoon Kharif); Kalbaishakhi seasonal storm damage years; rainfall annual mean and distribution (BMD); temperature range; siltation as a chronic problem distinct from acute flood events; water logging duration per year. Identify which specific unions are disproportionately vulnerable to which specific hazards.

Risk · Temporal

6. Biodiversity & Ecology

Named fish species (resident + migratory); bird species (resident + migratory); mammals, reptiles, amphibians recorded; plant species of homestead forest; presence of IUCN Red List threatened species; documented fish production from open water capture fisheries; historical records of species now locally extinct — this last category is often obtainable only from community oral history and FGD sessions with long-resident farmers and fishers.

Qualitative · Semi-quantitative

7. Agricultural Profile

Cropping intensity and pattern (Boro/Aman/Aus area from DAE block data); major crops and seasonal calendar; fertiliser use per hectare; pesticide type and application rate; irrigation coverage (surface/groundwater %); yield trends from BBS crop statistics; documented fish kill events linked to agrochemical runoff. A national-level figure of 25% agrochemical runoff contamination into soil and water is often quoted for Bangladesh in the context of Bangladesh Environment Conservation Act 1995 enforcement; trace it to a citable source before using it in a report.

Quantitative

8. Environmental Pollution

Number and location of brick fields; proximity of brick fields to agricultural land; type of land affected (fertile double/triple-crop land is the critical concern); air pollution types (smoke, dust, heat); health impacts on surrounding communities (chest/lung disease); domestic solid waste management status; industrial effluent sources; water quality (DO, BOD, turbidity) from DoE/DPHE if available; enforcement records from DoE district office. Topsoil extraction for brickmaking is a distinct threat requiring separate treatment from brick field air pollution.

Risk · Semi-quantitative

9. Socioeconomic & Infrastructure

Total population, literacy rate, per capita income (BBS); main livelihoods and occupational breakdown; road network length (paved/earthen) from LGED; market/hat bazaar locations; school and health facility count; electricity and water supply coverage; migration trends linked to environmental stress. Critically: road construction without adequate drainage culverts is a primary driver of water logging — document specific unplanned road interventions where known.

Quantitative · Semi-quantitative

10. Legal & ECA/ESA Status

Whether any part of the upazila falls within a DoE-designated Ecologically Critical Area (ECA). If no official designation exists: identify the most ecologically significant local wetland, beel, or river system by name, treat it as a de facto ecologically sensitive area, describe its biodiversity and threat profile in detail, and argue for formal recognition under existing law. Key applicable laws: Bangladesh Environment Conservation Act 1995; Bangladesh Environment Conservation Rules 1997; ECA Notification SRO No. 210, 1999; Wildlife (Conservation and Security) Act 2012; Wetland Policy 1998.

Legal

Primary institutional sources — Bangladesh context

Institution	Data Available	Data Type
BBS bbs.gov.bd	Population census (2011, 2022), agricultural census, upazila statistics yearbook, crop area and production data, occupational breakdown, literacy, per capita income	Statistical
SRDI srdi.gov.bd	Soil Resource Development Institute — soil survey reports, soil series maps, land capability classification by upazila, soil pH and organic matter data, F0–F3 land type maps	Spatial / Soil
SPARRSO sparrso.gov.bd	Satellite imagery (SPOT, Resourcesat), land use/land cover maps, flood mapping (annual), vegetation indices (NDVI), char dynamics, water body area mapping	Remote Sensing
BWDB bwdb.gov.bd	River gauge and discharge data, flood frequency analysis, erosion and accretion index maps, drainage master plans, haor/wetland surveys, groundwater data, beel area records	Hydrology
DoE Bangladesh doe.gov.bd	ECA designations and gazette notifications, environmental clearance records, pollution monitoring data (air/water quality), enforcement case history, brick field licensing records	Regulatory
FFWC ffwc.gov.bd	Flood Forecasting and Warning Centre — historical flood years, inundation depth and duration per district, real-time and archive flood maps by upazila	Flood / Risk
DAE dae.gov.bd	Upazila-level crop calendar, Boro/Aman/Aus area data, fertiliser and pesticide use per block, agricultural extension records, seasonal crop damage reports	Agricultural
BMD bmd.gov.bd	Bangladesh Meteorological Department — rainfall normals (monthly/annual), temperature records, relative humidity, wind speed data, cyclone and nor'wester track records by district	Climate
DPHE dphe.gov.bd	Groundwater depth and quality data, arsenic contamination maps (union-level), safe drinking water coverage, tube-well monitoring records	Water Quality
LGED lged.gov.bd	Rural road network (paved/earthen km), drainage infrastructure records, upazila and union council infrastructure maps, culvert inventory	Infrastructure
DoF fisheries.gov.bd	Department of Fisheries — open water fisheries production data (upazila-level), beel and haor stocking records, fish landing centre data, fisheries diversity surveys	Biodiversity
CEGIS cegis.org	Climate vulnerability assessment reports, GIS datasets, floodplain and char dynamics maps, delta modelling outputs, environmental impact assessment support data	Spatial / Climate
DLRS / MoLand minland.gov.bd	Mouza maps, RS cadastral records, land classification (khas land, vested property), settlement records, upazila boundary polygons	Land Records
IUCN Bangladesh iucn.org/bd	Red List assessments for Bangladesh species (Volumes 1–7, 2015), threatened species distribution by division/district, habitat profiles for wetland and forest species	Biodiversity
Sentinel Hub / GEE Free access	Google Earth Engine and Copernicus Sentinel-2 / Landsat 9 imagery — free multitemporal satellite imagery for NDWI water body mapping, NDVI vegetation, LULC change detection, flood extent mapping	Remote Sensing
Upazila Parishad Local	Local disaster records, brick field locations, community knowledge from UP chairmen and members, informal land use information, primary FGD data source	Primary / Local

How to analyse each data type — methods & outputs

A. Land Use Change Analysis

Overlay two-date LULC maps (e.g., SPARRSO 2010 vs Sentinel-2 2024). Calculate net change per zone in hectares and percentage. Use QGIS or ArcGIS. Key metrics: agricultural land loss rate per year, water body area reduction, settlement expansion index. A transition matrix shows which zones converted to what — for example, how much agricultural land became settlement or brick field between two dates. This is the single most important analysis for a land zoning report.

GIS: QGIS / ArcGIS · Quantitative output
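
As a rough illustration of the transition matrix described above, the sketch below counts cell-by-cell class changes between two dates and derives net change per class. It assumes both LULC maps have already been resampled to the same grid and flattened to lists of class labels; the class names, the six-cell toy input, and the 0.01 ha cell size (a 10 m pixel) are illustrative assumptions, not data from any real upazila.

```python
from collections import Counter

def transition_matrix(classes_t1, classes_t2):
    """Count cells by (class at date 1, class at date 2)."""
    if len(classes_t1) != len(classes_t2):
        raise ValueError("both dates must cover the same cells")
    return Counter(zip(classes_t1, classes_t2))

def net_change_ha(matrix, cell_area_ha):
    """Net area change per class: area at date 2 minus area at date 1."""
    change = Counter()
    for (c1, c2), n in matrix.items():
        change[c1] -= n * cell_area_ha
        change[c2] += n * cell_area_ha
    return dict(change)

# Toy input: six cells, hypothetical class labels.
t1 = ["agri", "agri", "agri", "water", "water", "settlement"]
t2 = ["agri", "settlement", "brick", "water", "agri", "settlement"]

matrix = transition_matrix(t1, t2)
change = net_change_ha(matrix, cell_area_ha=0.01)  # 10 m x 10 m cell

for (c1, c2), n in sorted(matrix.items()):
    print(f"{c1:>10} -> {c2:<10} {n} cell(s)")
```

Each off-diagonal entry such as `("agri", "settlement")` is the amount of agricultural land that became settlement; dividing the net change by the number of years between the two maps gives the annual loss rate.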

B. Flood Vulnerability Analysis

Cross-reference FFWC inundation data with the F0–F3 land type classification from SRDI land type maps (F0 = non-flooded, F1 = shallow, F2 = medium, F3 = deeply flooded). Map union-level flood frequency from historical records. Overlay with population density and crop land to produce a socio-ecological vulnerability score per union. Identify the worst-affected unions by name; this is what distinguishes a specific, credible report from a generic one.

GIS overlay · Risk scoring
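
The framework does not prescribe a scoring formula, so the following is only one possible sketch: min-max normalise each indicator across unions and combine them with equal weights. The union names, indicator values, and the equal weighting are all assumptions made for illustration.

```python
def min_max(values):
    """Rescale a {union: value} mapping to the 0-1 range."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {k: 0.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}

def vulnerability_scores(flood_freq, pop_density, crop_share,
                         weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of normalised indicators (weights are an assumption)."""
    f, p, c = min_max(flood_freq), min_max(pop_density), min_max(crop_share)
    w_f, w_p, w_c = weights
    return {u: w_f * f[u] + w_p * p[u] + w_c * c[u] for u in flood_freq}

# Hypothetical unions and values, for illustration only.
flood_freq = {"Union A": 8, "Union B": 3, "Union C": 5}          # flood years on record
pop_density = {"Union A": 1200, "Union B": 900, "Union C": 1500}  # persons per km2
crop_share = {"Union A": 0.70, "Union B": 0.50, "Union C": 0.60}  # crop land fraction

scores = vulnerability_scores(flood_freq, pop_density, crop_share)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The ranking, not the absolute score, is what feeds the report: it tells the writer which unions to name first. Weights should be agreed with the reviewing expert before the scores are used.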

C. Water Body Depletion Assessment

Use NDWI (Normalized Difference Water Index = (Green − NIR) / (Green + NIR)) from Landsat/Sentinel time series to track water body area change over 10–20 years. Run this in Google Earth Engine (free) using the upazila boundary polygon as the region of interest. Cross-validate with BWDB beel and haor records. Calculate the annual rate of wetland loss. Identify siltation-affected khals and beels by location. This analysis directly supports the ECA/ESA argument in the report.

Remote sensing · Time-series
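
The NDWI formula and the wetland loss rate can be worked through on a handful of pixels. The sketch below is plain Python rather than Earth Engine code; the reflectance pairs, the 900 m² pixel size, the NDWI > 0 water threshold, and the simple linear (non-compounding) annual rate are assumptions chosen for the example.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index = (Green - NIR) / (Green + NIR)."""
    denom = green + nir
    if denom == 0:
        return 0.0
    return (green - nir) / denom

def water_area_ha(pixels, pixel_area_m2, threshold=0.0):
    """Area of pixels whose NDWI exceeds the threshold, in hectares."""
    n_water = sum(1 for green, nir in pixels if ndwi(green, nir) > threshold)
    return n_water * pixel_area_m2 / 10_000

def annual_loss_rate_pct(area_start_ha, area_end_ha, years):
    """Average yearly loss as a percentage of the starting area (linear)."""
    return 100.0 * (area_start_ha - area_end_ha) / area_start_ha / years

# Made-up (green, NIR) surface reflectance pairs for the same four pixels.
pixels_2010 = [(0.10, 0.04), (0.09, 0.05), (0.08, 0.03), (0.05, 0.30)]
pixels_2024 = [(0.10, 0.04), (0.06, 0.25), (0.05, 0.28), (0.05, 0.30)]

area_2010 = water_area_ha(pixels_2010, pixel_area_m2=900)
area_2024 = water_area_ha(pixels_2024, pixel_area_m2=900)
rate = annual_loss_rate_pct(area_2010, area_2024, years=14)
print(f"{area_2010:.2f} ha -> {area_2024:.2f} ha, {rate:.2f}% per year")
```

Water reflects more green than near-infrared light, so water pixels give a positive NDWI while vegetation and bare soil give a negative one. Compare images from the same season; otherwise monsoon inundation will be mistaken for permanent water.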

D. Soil Quality & Agricultural Land Profiling

Map SRDI soil series for the upazila. Cross with land type classification and crop intensity data (double/triple-crop areas). Identify prime agricultural soils and flag areas at conversion risk. Agrochemical load is estimated from DAE fertiliser use data and converted to kg/ha per season. Cross this with DoF fish kill event records and beel/river locations to establish proximity between agrochemical use and aquatic mortality events.

Quantitative · Cross-source

E. Biodiversity Status Assessment

Compile a species inventory from secondary literature, DoF records, IUCN Bangladesh Red List (Volumes 1–7), and primary FGD sessions with local fishers, farmers, and UP members. Classify species by IUCN status: Least Concern, Near Threatened, Vulnerable, Endangered, Critically Endangered. Document locally extinct species based on community oral history — this is often the only source for historically present species. No statistical modelling required; systematic documentation and classification is the output.

Qualitative inventory · IUCN classification

F. ECA / ESA Suitability Analysis

Apply DoE ECA criteria from the 1999 SRO No. 210 notification: biodiversity significance, migratory bird presence, fish spawning and nursery grounds, threatened ecosystem type, ecological connectivity value. Score major wetlands and beels against these criteria. If a specific beel or haor meets the threshold but has no formal designation, document the scoring evidence and argue explicitly for designation under the Bangladesh Environment Conservation Rules 1997. This is a formal legal argument, not a vague recommendation.

Legal analysis · Criteria scoring

G. Brick Field Impact Analysis

Map all brick field locations from satellite imagery or field survey. Draw 500 m and 1 km buffer zones around each. Calculate agricultural land area within buffers that is affected by topsoil extraction and air pollution fallout. Cross with SRDI fertile soil zones to identify the quality of land being destroyed. This produces a localised pressure index on prime agricultural land. Also cross-reference with school, health facility, and settlement locations within the air pollution buffer for health impact assessment.

GIS buffer analysis · Pressure index
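
For the point-feature part of this analysis (schools, health facilities, settlements), buffer membership can be checked without a GIS. The sketch below uses the haversine great-circle distance; the coordinates and facility names are invented for illustration. Calculating the agricultural land area inside each buffer still needs polygon overlay in QGIS or a similar tool.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def buffer_class(point, brick_fields):
    """Classify a point by distance to its nearest brick field."""
    nearest = min(haversine_m(*point, *bf) for bf in brick_fields)
    if nearest <= 500:
        return "within_500m"
    if nearest <= 1000:
        return "within_1km"
    return "outside"

# Hypothetical coordinates (latitude, longitude).
brick_fields = [(24.000, 91.000)]
facilities = {
    "School A": (24.003, 91.000),   # roughly 330 m north
    "Clinic B": (24.007, 91.000),   # roughly 780 m north
    "Village C": (24.020, 91.000),  # roughly 2.2 km north
}

exposure = {name: buffer_class(pt, brick_fields) for name, pt in facilities.items()}
print(exposure)
```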

H. Climate Risk Narrative Synthesis

Use BMD rainfall trend data, FFWC historical flood records, and BBS crop loss statistics together. Identify years with compound disasters — flood compounded by storm, or drought following flood damage. Write a qualitative narrative identifying which unions face the highest multi-hazard exposure. Also incorporate climate change projections for the district from BCAS or CEGIS reports. This does not require modelling — documentary analysis of existing records suffices for an upazila-level report.

Temporal synthesis · Multi-hazard
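
Finding compound-disaster years is a simple cross-tabulation of the hazard records. The sketch below lists years in which two or more hazards were recorded; the hazard names and years are placeholder values, and matching on the same calendar year is a simplifying assumption (a drought that follows a flood in the next year would need a wider window).

```python
def compound_years(hazard_years, min_hazards=2):
    """Return {year: [hazards]} for years with at least min_hazards recorded."""
    by_year = {}
    for hazard, years in hazard_years.items():
        for year in years:
            by_year.setdefault(year, set()).add(hazard)
    return {
        year: sorted(hazards)
        for year, hazards in sorted(by_year.items())
        if len(hazards) >= min_hazards
    }

# Placeholder records, not real data for any upazila.
records = {
    "flood": [1988, 1998, 2004],
    "storm": [1998, 2010],
    "drought": [2004, 2010],
}

compound = compound_years(records)
print(compound)
```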

Step-by-step report production workflow — one upazila

1. Define upazila scope & confirm identity

Collect administrative identity: district, division, union list, total area, population. Confirm boundary from DLRS mouza maps or CEGIS polygon data. Get GPS centroid coordinates. Critically: confirm the primary river(s), beels, and haors by cross-referencing BWDB records and SPARRSO maps — do not assume the district-level river is the primary local river. This verification step prevents the most common geographic error in upazila reports.

DLRS / CEGIS · BBS

2. Compile secondary data from institutions

Pull BBS census data, SRDI soil reports, BMD climate normals, DAE crop area, DoF fisheries data, BWDB flood and erosion records, DPHE groundwater maps. Create a structured data file per upazila (JSON or spreadsheet) using a consistent schema with defined field names and values. Flag every field where data is unavailable or estimated; these become your "আনুমানিক তথ্য অনুযায়ী" ("according to estimated data") notes in the final report.

All BD agencies · 2–3 days

3. Process land use / zoning data

Input provided zoning data (zone name, area in acres, percentage of total). Calculate totals. Identify dominant zones, critically small zones, and zones that are absent entirely. Calculate ratios: agriculture vs settlement; water body vs upazila total; forest vs settled area. Draft analytical commentary — this critical interpretation is what distinguishes a substantive report from a simple data recitation and is required by the report prompt.

Spreadsheet / Python
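
The calculations in this step can be sketched in plain Python. The zone names and acreages below are made-up illustration values, and both the 1% "critically small" threshold and the `EXPECTED_ZONES` list are assumptions chosen for the example rather than part of the framework.

```python
# Hypothetical zoning input: zone name -> area in acres.
zones = {
    "agricultural": 30000.0,
    "settlement": 7500.0,
    "water_bodies": 2000.0,
    "forest": 400.0,
    "industrial": 100.0,
}

# Zones the analyst expects to see; used to detect absent categories.
EXPECTED_ZONES = ["agricultural", "settlement", "water_bodies", "wetlands",
                  "forest", "industrial", "char_land"]

SMALL_SHARE_PCT = 1.0  # assumed cut-off for "critically small"

def summarise(zones):
    """Totals, percentage shares, and the ratios the commentary relies on."""
    total = sum(zones.values())
    shares = {name: 100.0 * area / total for name, area in zones.items()}
    return {
        "total_acres": total,
        "shares_pct": shares,
        "dominant": max(shares, key=shares.get),
        "small": sorted(n for n, s in shares.items() if s < SMALL_SHARE_PCT),
        "absent": [z for z in EXPECTED_ZONES if z not in zones],
        "agri_to_settlement": zones["agricultural"] / zones["settlement"],
        "water_share_pct": shares["water_bodies"],
    }

summary = summarise(zones)
for key, value in summary.items():
    print(key, value)
```

The `small` and `absent` lists are prompts for the analytical commentary: each entry needs a sentence explaining why that zone is missing or marginal in this particular upazila.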

4. GIS analysis (where maps are available)

Using QGIS: overlay land use zones, SRDI soil map, BWDB flood map, and brick field locations. Produce buffer analysis around water bodies and brick fields. Calculate NDWI change from Landsat and Sentinel-2 imagery using Google Earth Engine (free). Use the upazila boundary as the region of interest, extract water body area for two time periods (e.g. 2010 from Landsat 5 and 2024 from Sentinel-2, since Sentinel-2 imagery is not available before 2015), and calculate the change. Output summary statistics per zone. This step can be skipped if maps are genuinely unavailable; in that case, note the limitation in the report.

QGIS / GEE · 3–5 hours

5. Primary data collection: field / consultation

Hold focus group discussions (FGDs) with farmers, fishers, and UP members at union level. Document local knowledge on species loss, flood years, water logging duration, brick field locations and impacts. Record oral history of environmental change. Use a structured questionnaire with consistent fields across upazilas for comparability. This is the only source for locally extinct species, recent fish kill events, and the lived impact of water logging on agriculture. It cannot be replaced by secondary data.

Field work · 1–2 days on-site

6. Build the structured data file (JSON / YAML)

Compile all collected data into a single JSON or YAML file per upazila using the master schema. Every field has a defined key. Fields with no data have a null value and a reason string in a notes section. This file becomes the single source of truth for report generation — if a number is not in this file, it cannot appear in the generated report. This schema also serves as the audit trail for QC verification.

Master schema · JSON / YAML
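
The rule that every null field carries a reason can be enforced mechanically. The sketch below walks a data file, collects the dotted path of each null value, and reports those with no entry under `_notes.null_reasons`. Accepting either the full dotted path or the bare field name as the reason key is an assumption about the key convention, and the sample record is invented; nulls inside lists are not checked in this minimal version.

```python
import json

def find_nulls(node, prefix=""):
    """Yield dotted paths of every null leaf, skipping the _notes section."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "_notes":
                continue
            yield from find_nulls(value, f"{prefix}{key}.")
    elif node is None:
        yield prefix.rstrip(".")

def unexplained_nulls(data):
    """Paths of null fields that have no reason string recorded."""
    reasons = data.get("_notes", {}).get("null_reasons", {})
    missing = []
    for path in find_nulls(data):
        leaf = path.rsplit(".", 1)[-1]
        if path not in reasons and leaf not in reasons:
            missing.append(path)
    return missing

# Invented sample record for illustration.
record = json.loads("""
{
  "administrative": {"upazila": "Example", "district": "Example District"},
  "citations": {"bbs": "BBS 2022", "srdi": null},
  "hydrology": {"primary_river": null},
  "_notes": {"null_reasons": {"srdi": "Not yet retrieved"}}
}
""")

problems = unexplained_nulls(record)
print(problems)
```

Running this before report generation keeps undocumented gaps out of the single source of truth.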

7. Generate Bangla report (AI-assisted via Claude API)

Feed the structured data file and land use data into the report prompt template. Use the Claude API (claude-sonnet-4) with the formal Bangla environmental report prompt. The prompt injects all location-specific variables from the data file. Review the output immediately: any number in the report that does not appear in the data file is a potential hallucination and must be flagged before the QC step.

Claude API · 15–30 minutes

8. Automated QC check

Run the QC checker script against the generated report and its source data file. The checker verifies: word count (1,500–2,000), all numbers traceable to the data file, no unexpected English words, no headings/bullets/tables in the prose, no excessive placeholder citations, and upazila/district name consistency. Fix every flagged issue before proceeding to expert review. A report that fails QC must not go to the expert — that wastes expert time on avoidable errors.

QC script · Automated

9. Expert review, citation check & final submission

A subject-matter expert familiar with that district reviews the QC-passed draft. They verify qualitative accuracy: species names, soil series names, river names, and historical flood events specific to that upazila. All citations are verified against the actual source documents. Unverifiable citations are replaced with [তথ্যসূত্র যাচাই প্রয়োজন] ("citation verification required"). Final Bangla text is formatted in Word/PDF with government report conventions. Archive the data JSON file for future update cycles.

Expert review · 2–4 hours

Estimated time per report

Step	Estimated time
Secondary data compilation	4–6 hrs
GIS analysis (if maps available)	3–5 hrs
Field consultation (FGD)	1–2 days
Data file assembly	1–2 hrs
AI report generation	15–30 min
QC check (automated)	5–10 min
Expert review + edit	2–4 hrs
Total per upazila	2–4 days

Scale note: A district with 12 upazilas can be processed in 4–6 weeks if secondary data compilation is parallelised across team members and field visits are batched by geography. The AI generation and QC steps are effectively instantaneous at scale once the data files are ready.

Automation strategy for batch upazila report production

Core principle: Environmental reports at upazila level share ~80% of their structure — only the specific local data changes. The right investment is not in writing each report faster, but in building a data pipeline once that makes every subsequent report faster, more consistent, and more verifiable.

Step A: Build the master data schema (JSON)

Design a single JSON schema that covers all data fields needed across any upazila in Bangladesh. Each upazila gets its own .json file using this schema. Fields with no data are set to null with a reason string in a notes section. This schema is the backbone of the entire system — never skip defining it properly.

{
  "administrative": { "upazila": "Bijoynagar", "district": "Brahmanbaria", ... },
  "hydrology":      { "primary_river": "Titas", "named_beels": [...], ... },
  "disasters":      { "flood": { "major_flood_years": [1974,1988,1998,2004] }, ... },
  "land_use":       { "data_year": 2026, "zones": [...] },
  "eca_status":     { "officially_designated_eca": false, ... },
  "citations":      { "bbs": "BBS 2022...", "srdi": null },
  "_notes":         { "null_reasons": { "srdi": "Not yet retrieved" } }
}

Step B: Build the master prompt template

Create a prompt template where all location-specific variables are placeholders: {UPAZILA_NAME}, {DISTRICT}, {PRIMARY_RIVER}, {FLOOD_YEARS}, {LAND_USE_DATA}, etc. The template stays fixed across all upazilas and districts. Only the data injected into it changes per report. Version-control the template — any change to the prompt affects every future report.
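
A minimal sketch of template filling with a guard against missing values is shown below. The one-line template text is a stand-in for the real prompt, which is not reproduced here; the placeholder names follow the ones listed above, and the sample values come from the schema example used elsewhere in this framework.

```python
from string import Formatter

# Stand-in template; the real prompt is far longer.
TEMPLATE = (
    "Write a formal environmental report for {UPAZILA_NAME} upazila, "
    "{DISTRICT} district. Primary river: {PRIMARY_RIVER}. "
    "Major flood years: {FLOOD_YEARS}."
)

def placeholders(template):
    """Names of all {PLACEHOLDER} fields that appear in the template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}

def fill(template, values):
    """Fill the template, refusing to run if any placeholder has no value."""
    missing = placeholders(template) - set(values)
    if missing:
        raise KeyError(f"missing template values: {sorted(missing)}")
    return template.format(**values)

values = {
    "UPAZILA_NAME": "Bijoynagar",
    "DISTRICT": "Brahmanbaria",
    "PRIMARY_RIVER": "Titas",
    "FLOOD_YEARS": "1974, 1988, 1998, 2004",
}
prompt = fill(TEMPLATE, values)
print(prompt)
```

Failing loudly on a missing value is deliberate: a prompt sent with an unfilled placeholder invites the model to invent the missing detail. If the real template contains literal braces (for example an embedded JSON sample), they must be doubled as `{{` and `}}` for `str.format` to work.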

Step C: Python script: data file → filled prompt → API call → output

A Python script reads each upazila .json file, fills in the template placeholders, sends the complete prompt to the Claude API, and saves the output as a .txt and .docx file. Run in a loop for all upazilas in a district with rate-limit delay between calls.

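# Install: pip install anthropic python-docx
import json
import os
from pathlib import Path

import anthropic

# Read the API key from the environment instead of hard-coding it
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# Load the version-controlled prompt template (file name is an example)
TEMPLATE = Path("prompt_template.txt").read_text(encoding="utf-8")

# Load upazila data
with open("bijoynagar.json", encoding="utf-8") as f:
    data = json.load(f)

# Build and send prompt (keyword names must match the template placeholders)
prompt = TEMPLATE.format(
    UPAZILA_NAME  = data["administrative"]["upazila"],
    DISTRICT      = data["administrative"]["district"],
    LAND_USE_DATA = json.dumps(data["land_use"]["zones"], ensure_ascii=False),
    PRIMARY_RIVER = data["hydrology"]["primary_river"],
    FLOOD_YEARS   = ", ".join(str(y) for y in
                    data["disasters"]["flood"]["major_flood_years"])
)
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[{"role": "user", "content": prompt}]
)
# If message.stop_reason == "max_tokens", the report was cut off: raise max_tokens

# Save output
report = message.content[0].text
out_path = f"{data['administrative']['upazila']}_report.txt"
with open(out_path, "w", encoding="utf-8") as f:
    f.write(report)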
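
The district-wide loop with a rate-limit delay, as described for Step C, might be structured as follows. To keep the sketch independent of the API, the call that produces the report text is passed in as `generate`, a hypothetical callable that takes the loaded data dictionary and returns a string; the directory layout and the 2-second default delay are also assumptions.

```python
import json
import time
from pathlib import Path

def run_batch(data_dir, out_dir, generate, delay_s=2.0):
    """Generate one report per *.json data file and record the outcome."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for path in sorted(Path(data_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        name = data["administrative"]["upazila"]
        try:
            report = generate(data)
        except Exception as exc:  # keep going; one failure should not stop the batch
            results[name] = f"failed: {exc}"
            continue
        (out_dir / f"{name}_report.txt").write_text(report, encoding="utf-8")
        results[name] = "ok"
        time.sleep(delay_s)  # crude rate limiting between API calls
    return results
```

In use, `generate` would wrap the template filling and the `client.messages.create` call, and the returned `results` dictionary doubles as a batch log showing which upazilas need a rerun.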

Step D: GIS automation (QGIS Python / Google Earth Engine)

Use PyQGIS (the Python API inside QGIS) or the Google Earth Engine Python API to auto-generate standard analysis outputs for each upazila from its boundary polygon or bounding box: NDWI water body area for two dates, NDVI vegetation cover, LULC classification, flood zone overlay. GEE is free for non-commercial use; it requires a Google account and, in current versions, a registered Cloud project. A single GEE script can run for all upazilas in a district sequentially, outputting a CSV of summary statistics per upazila that feeds directly into the data schema files.

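# Google Earth Engine: NDWI water body change (example)
import ee
ee.Initialize()  # newer API versions may need ee.Initialize(project="your-project-id")

upazila_boundary = ee.FeatureCollection("your-asset-path")

def scale_landsat(img):
    # Apply the Collection 2 Level-2 surface reflectance scale and offset
    return img.select("SR_B.").multiply(0.0000275).add(-0.2)

# 2010: Landsat 5 (Sentinel-2 imagery does not exist for this year)
l5_2010 = ee.ImageCollection("LANDSAT/LT05/C02/T1_L2") \
    .filterBounds(upazila_boundary) \
    .filterDate("2010-01-01", "2011-01-01") \
    .map(scale_landsat) \
    .median()

# 2024: Sentinel-2 surface reflectance
s2_2024 = ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED") \
    .filterBounds(upazila_boundary) \
    .filterDate("2024-01-01", "2025-01-01") \
    .median()

# Note: a full-year median mixes monsoon and dry-season water;
# restrict both date ranges to the same season for a fair comparison.

# NDWI = (Green - NIR) / (Green + NIR)
# Landsat 5: green = SR_B2, NIR = SR_B4 (SR_B5 is SWIR1, not NIR)
ndwi_2010 = l5_2010.normalizedDifference(["SR_B2", "SR_B4"])
# Sentinel-2: green = B3, NIR = B8
ndwi_2024 = s2_2024.normalizedDifference(["B3", "B8"])
water_2010 = ndwi_2010.gt(0).selfMask()
water_2024 = ndwi_2024.gt(0).selfMask()
# Then compute area statistics per upazila...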

Step E: Automated Word document assembly (python-docx)

Use the python-docx library to auto-insert the generated Bangla text into a pre-designed Word template with the government report header, proper page margins (1.2 in top/bottom, 1.3 in left/right), a Unicode Bangla font (for example Kalpurush; SutonnyMJ is a legacy ANSI-encoded font and will not render Unicode Bangla text correctly), page numbering, and section spacing. Export to PDF using LibreOffice headless mode. No manual formatting needed per report.

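# Install: pip install python-docx
import subprocess

from docx import Document
from docx.shared import Pt

upazila = "Bijoynagar"  # example value; take it from the data file
with open(f"{upazila}_report.txt", encoding="utf-8") as f:
    report_text = f.read()

doc = Document("govt_letterhead_template.docx")
doc.add_heading(f"{upazila} উপজেলা — পরিবেশ প্রতিবেদন", level=1)
for paragraph in report_text.split("\n\n"):
    if not paragraph.strip():
        continue  # skip empty blocks
    p = doc.add_paragraph()
    run = p.add_run(paragraph.strip())
    run.font.size = Pt(12)  # set size on the run, not on the shared paragraph style
doc.save(f"{upazila}_env_report.docx")

# Convert to PDF (requires LibreOffice installed; the binary may be named "soffice")
subprocess.run(["libreoffice", "--headless", "--convert-to", "pdf",
                f"{upazila}_env_report.docx"], check=True)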

Step F: Automated QC checker

A Python QC script reads the generated report and its source data file and runs seven checks: (1) word count 1,500–2,000; (2) all numbers in report traceable to data file — any number not in the data file is flagged as a potential hallucination; (3) no unexpected English words in Bangla text; (4) no headings/bullets/tables in prose; (5) count of [তথ্যসূত্র যাচাই প্রয়োজন] placeholders; (6) upazila and district names present; (7) null fields noted in data file. Outputs a machine-readable QC JSON report plus console summary.

# Run QC on a single report
python report_qc_checker.py \
    --report  ./reports/Bijoynagar_report.txt \
    --data    ./data/bijoynagar.json \
    --save-json

# Run QC on an entire district batch
python report_qc_checker.py \
    --reports-dir ./reports \
    --data-dir    ./data \
    --save-json

# Output example:
# ✓  [word_count]            1,847 words — OK
# ✗  [hallucination_guard]   3 number(s) not in source data
# ✓  [forbidden_structure]   prose-only format maintained
# ✓  [name_consistency]      OK
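
The number-traceability check (check 2) could work roughly as sketched below. This is not the actual `report_qc_checker.py`, whose internals are not shown here; it is an illustration under stated assumptions: Bangla digits are normalised to ASCII, thousands separators in the Western 3-digit grouping are stripped (lakh/crore style grouping such as 1,23,456 is not handled), and a number counts as traceable if the same value appears anywhere in the data file. The sample report sentence and data are invented.

```python
import re

BANGLA_DIGITS = str.maketrans("০১২৩৪৫৬৭৮৯", "0123456789")
THOUSANDS_SEP = re.compile(r"(?<=\d),(?=\d{3}\b)")
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")

def numbers_in_text(text):
    """All numeric values in a string, with Bangla digits normalised."""
    normalised = THOUSANDS_SEP.sub("", text.translate(BANGLA_DIGITS))
    return {float(m) for m in NUMBER_RE.findall(normalised)}

def numbers_in_data(node):
    """All numeric values found anywhere in a nested data structure."""
    found = set()
    if isinstance(node, bool):
        return found  # True/False are not report numbers
    if isinstance(node, (int, float)):
        found.add(float(node))
    elif isinstance(node, str):
        found |= numbers_in_text(node)
    elif isinstance(node, dict):
        for value in node.values():
            found |= numbers_in_data(value)
    elif isinstance(node, list):
        for value in node:
            found |= numbers_in_data(value)
    return found

def untraceable_numbers(report_text, data):
    """Numbers in the report that do not occur in the source data."""
    return sorted(numbers_in_text(report_text) - numbers_in_data(data))

# Invented sample data and report text.
data = {
    "disasters": {"flood": {"major_flood_years": [1988, 1998]}},
    "land_use": {"zones": [{"name": "agricultural", "area_acres": 30000, "pct": 75.0}]},
}
report = "১৯৮৮ ও ১৯৯৮ সালে বন্যা হয়। কৃষিজমি ৩০,০০০ একর (৭৫%)। ২০০৭ সালেও ক্ষতি হয়।"

flagged = untraceable_numbers(report, data)
print(flagged)
```

Here the year 2007 is flagged because it appears in the report but nowhere in the data file. A flag is a prompt for human checking, not proof of error, since a legitimately derived figure (a sum or a rounded percentage) will also be flagged unless it is stored in the data file.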

Critical risk: Claude may still produce plausible-sounding but unverifiable qualitative statements even when all numbers are correct. The QC checker catches numerical hallucinations — but expert review (Workflow Step 9) remains essential for qualitative accuracy: species names, river names, soil series, and historical events specific to that upazila.

Full automation tools summary

Tool	Purpose	Cost
Claude API claude-sonnet-4	Bangla report generation from structured prompt + data file. Primary generation engine.	Paid per token
Python anthropic SDK	Batch API calls, prompt template filling, output file saving, rate-limit management, batch logging.	Free
QGIS PyQGIS	GIS overlay analysis, buffer zones, land use change maps, spatial statistics. Desktop GIS.	Free
Google Earth Engine Python API	NDWI water body mapping, NDVI, LULC change, flood extent — cloud GIS at upazila scale.	Free (non-commercial)
python-docx pip install	Auto-assemble Bangla report text into a formatted Word document with government report template styling.	Free
LibreOffice headless mode	Convert .docx to PDF automatically via command line. No Microsoft Office required.	Free
JSON / YAML files schema-based	Per-upazila structured data store. No database needed. Version-controllable with Git.	Free
Pandas / openpyxl pip install	Land use data analysis, zone-wise statistics, percentage calculations, transition matrix construction.	Free
Git version control	Track changes to data files and prompt template across the project. Essential for a multi-upazila, multi-person project.	Free

Realistic scale estimate: Once the master JSON schema and prompt template are built (one-time investment of ~2–3 days), and secondary data for an upazila is compiled into the schema (~4–6 hours per upazila), the AI generation + QC step adds only ~30 minutes per upazila. A team of three people can produce verified, QC-passed drafts for all upazilas in a district (typically 8–14 upazilas) within 3–4 weeks, leaving expert review as the primary time constraint.