Use when implementing or debugging the TSS remote-sensing workflow in this workspace: Landsat/Sentinel preprocessing, ACOLITE atmospheric correction, cloud/water masking, adjacency correction, station matchup, and model training.

Install

mkdir -p .claude/skills/tss-pipeline && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/15138" && unzip -o skill.zip -d .claude/skills/tss-pipeline && rm skill.zip

Installs to .claude/skills/tss-pipeline

About this skill

TSS Pipeline Skill

Scope

This skill applies to the workspace pipeline for TSS estimation in Ma/Ca rivers.

Key Inputs

  • Landsat archives: 2.RAW_Data/1.Landsat/*.tar
  • Sentinel-2 archives: 2.RAW_Data/2.Sentinel-2/*.SAFE.zip
  • AERONET: 2.RAW_Data/3.AERONET/...
  • In-situ per station: 2.RAW_Data/4.Observed_Data/*.csv (Date, TSS(g/m3))

Key Config

  • config/paths_config.json
  • config/processing_config.json
  • config/stations_config.json

Main Script Map

  • 01a_clip_raw_landsat.py
  • 01b_clip_raw_sentinel2.py
  • 02a_radio_cal_landsat.py
  • 02b_radio_cal_sentinel2.py
  • 03a_atmos_corr_landsat.py
  • 03b_atmos_corr_sentinel2.py
  • 04a_cloud_mask_landsat.py
  • 04b_cloud_mask_sentinel2.py
  • 05a_water_extract_landsat.py
  • 05b_water_extract_sentinel2.py
  • 06a_sunglint_qa_landsat.py
  • 06b_sunglint_qa_sentinel2.py
  • 07a_adjacency_corr_landsat.py
  • 07b_adjacency_corr_sentinel2.py
  • 07c_tmart_aec.py (env tmart, parallel method)
  • 08a_resample_harmonize_landsat.py (30m stack)
  • 08b_resample_harmonize_sentinel2.py (10m stack)
  • 09a_tss_model_build_landsat.py
  • 09b_tss_model_build_sentinel2.py
  • 10_tss_estimate.py
  • _utils/tss_models.py (shared model library — 17 models, analytical fitting, transect sampling, LogSpaceWrapper)

ACOLITE Notes

  • Use _utils/acolite_runner.py to execute ACOLITE.
  • Try the primary settings first, then the fallback settings.
  • For this ACOLITE codebase, call ac.acolite.acolite_run(...).

Phase Order

  1. Phase 0 setup/config
  2. Phase 1 clip raw scenes by station corridor
  3. Phase 2 radiometric QA (standalone — does not feed downstream)
  4. Phase 3 atmospheric correction (ACOLITE) → 03.Atmospheric_Correction/
  5. Phase 4 cloud/shadow masking → 04.Cloud_Shadow_Masking/
  6. Phase 5 water extraction → 05.Water_Extraction/
  7. Phase 6 sunglint QA/correction → 06.Sunglint_QA/ (optional enhancement)
  8. Phase 7 adjacency correction (RAdCor) → 07.Adjacency_Correction/RAdCor/ (optional enhancement)
  9. Phase 7c T-Mart AEC → 07.Adjacency_Correction/tmart/ (parallel method, env tmart, standalone)
  10. Phase 8a/8b harmonize/resample → reads Phase 07 RAdCor if available, else Phase 06 glintcorr, else Phase 03
    • 08a: Landsat → Rrs_stack_30m.tif
    • 08b: Sentinel-2 → Rrs_stack_10m.tif (10m; native Green/Red resolution)
  11. Phase 9a/9b model build → reads Phase 08 stacks + in-situ CSV; trains 17 models
    • Feature vector: [green, red, nir, ndti, red_nir_ratio, red_green_ratio]
    • ML and empirical models trained in log1p(TSS) space by default (log_transform_ml=true, log_transform_empirical=true)
    • Training filter: min_water_pixels (default 10) rejects scenes with insufficient water at station
    • Sampling: sampling_method=transect extracts Rrs along a cross-section perpendicular to flow; n_adjacent_sections adds parallel sections; max_cross_section_half_width_m caps transect width
  12. Phase 10 estimation/time series → reads Phase 08 stacks + sensor-specific Phase 09 model
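
As a rough illustration of the Phase 9 feature vector listed above, the sketch below builds the six features from green, red, and NIR Rrs values and shows the log1p/expm1 round trip applied to the TSS target. The NDTI formula (red − green)/(red + green) and the ratio directions (red/NIR, red/green) are assumptions inferred from the feature names; check _utils/tss_models.py for the definitions the pipeline actually uses. build_features and eps are hypothetical.

```python
import math
from typing import List


def build_features(green: float, red: float, nir: float, eps: float = 1e-9) -> List[float]:
    """Return [green, red, nir, ndti, red_nir_ratio, red_green_ratio].

    Assumed definitions (not confirmed by the pipeline source):
      ndti            = (red - green) / (red + green)
      red_nir_ratio   = red / nir
      red_green_ratio = red / green
    `eps` is added to each denominator to guard against division by zero.
    """
    ndti = (red - green) / (red + green + eps)
    red_nir_ratio = red / (nir + eps)
    red_green_ratio = red / (green + eps)
    return [green, red, nir, ndti, red_nir_ratio, red_green_ratio]


# The training target is transformed with log1p and inverted with expm1.
tss = 85.0                      # example TSS value in g/m3
y_log = math.log1p(tss)         # value a log-space model is trained on
tss_back = math.expm1(y_log)    # back-transform for reporting

features = build_features(green=0.020, red=0.030, nir=0.010)
```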

Completion Checks

  • Phase 1: 3.Pre-Processing/01.Clip_Raw/QA/*.csv exists and has rows.
  • Phase 3: 3.Pre-Processing/03.Atmospheric_Correction/QA/*.csv exists and statuses are not failing.
  • Phase 5: 3.Pre-Processing/05.Water_Extraction/QA/*.csv exists and includes water_coverage_pct.
  • Phase 8a/8b: 3.Pre-Processing/08.Resampled/QA/resample_harmonize_{landsat,sentinel2}.csv exist; check input_source column.
  • Phase 9a/9b: 4.Results/TSS_Models/{Landsat,Sentinel2}/model_comparison.csv has 17 rows; best_model_meta.json status=ok.
  • Phase 10: 4.Results/Time_Series/TSS_stations_timeseries.csv exists with tss_estimated_g_m3 values.
  • Station extraction uses only rows with valid lon/lat in stations_config.json.
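
A couple of these checks can be automated with the standard library. The sketch below is a minimal example, not part of the pipeline: qa_csvs_with_rows and model_comparison_ok are hypothetical helper names, and UTF-8 encoding of the QA files is assumed.

```python
import csv
from pathlib import Path
from typing import List


def qa_csvs_with_rows(qa_dir: Path) -> List[Path]:
    """Return the QA CSV files in `qa_dir` that contain at least one data row."""
    found = []
    for path in sorted(qa_dir.glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as fh:
            reader = csv.DictReader(fh)
            if next(reader, None) is not None:
                found.append(path)
    return found


def model_comparison_ok(csv_path: Path, expected_models: int = 17) -> bool:
    """Check that model_comparison.csv exists and has one row per model."""
    if not csv_path.is_file():
        return False
    with csv_path.open(newline="", encoding="utf-8") as fh:
        return sum(1 for _ in csv.DictReader(fh)) == expected_models
```

For example, qa_csvs_with_rows(Path("3.Pre-Processing/01.Clip_Raw/QA")) returning an empty list would mean the Phase 1 check has not passed yet.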

Phase 08 Input Priority (important)

08a and 08b read bands via select_input_scene() with this priority:

  1. 07.Adjacency_Correction/<sensor>/RAdCor/<station>/<scene>/ — highest quality
  2. 06.Sunglint_QA/<sensor>/<station>/<scene>/ — glint-corrected (uses *_glintcorr.tif)
  3. 03.Atmospheric_Correction/<sensor>/<station>/<scene>/ — fallback

The input_source column in the QA CSV records which source was used for each scene.
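
The priority order can be pictured as a first-match lookup over candidate directories. The sketch below is not the real select_input_scene(): pick_input_dir, the labels, and the rule that a source counts as available when its directory exists are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional, Tuple

# Candidate locations from highest to lowest priority; labels are illustrative.
PRIORITY = (
    ("radcor", "07.Adjacency_Correction/{sensor}/RAdCor/{station}/{scene}"),
    ("glintcorr", "06.Sunglint_QA/{sensor}/{station}/{scene}"),
    ("atmos_corr", "03.Atmospheric_Correction/{sensor}/{station}/{scene}"),
)


def pick_input_dir(root: Path, sensor: str, station: str, scene: str) -> Optional[Tuple[str, Path]]:
    """Return (label, directory) for the first usable candidate, else None."""
    for label, template in PRIORITY:
        candidate = root / template.format(sensor=sensor, station=station, scene=scene)
        if label == "glintcorr":
            # Phase 06 only counts when a glint-corrected raster is present.
            usable = candidate.is_dir() and any(candidate.glob("*_glintcorr.tif"))
        else:
            usable = candidate.is_dir()
        if usable:
            return label, candidate
    return None
```

The returned label plays the role of the input_source value, so a scene that silently fell back to Phase 03 is easy to spot.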

Phase 09 Model Library (_utils/tss_models.py)

  • build_all_models(n_starts, target_r2, log_transform_ml, log_transform_empirical) → 17 models
  • Group 1 — Empirical (5): PowerLaw, Exponential, Poly2Red, Poly2NDTI, BandRatioPower
  • Group 2 — Semi-empirical (4): Nechad2010 (C_N fixed=0.1724 sr), Nechad2010Full, Doxaran2002, HanMiller
  • Group 3 — OLS log-linear (3): PowerLawOLS, ExponentialOLS, BandRatioPowerOLS
  • Group 4 — ML (5): LinearRegression, RidgeCV, RandomForest(n=300), GradientBoosting(n=200,depth=4), SVR_RBF
  • Fitting methods (no AutoTuner):
    • _fit_exp_ab_grid: B-grid scan + closed-form A = Σ(y·exp(Bx))/Σ(exp(Bx)²)
    • _fit_power_ols: log-log OLS via lstsq
    • _fit_poly2_ols: direct lstsq for 2nd-degree polynomial
    • _fit_nechad_grid: C-grid scan + closed-form A for each C
  • LogSpaceWrapper: wraps ML and empirical models to fit/predict in log1p(TSS) space (controlled by log_transform_ml and log_transform_empirical)
  • ModelEvaluator: chronological 70/30 split (first 70% of dates = train, last 30% = test); metrics = RMSE/MAE/R²/bias/MAPE_pct
  • evaluate_with_predictions(): returns (metrics, all_preds[n], split_labels[n])
  • count_water_pixels(water_mask_path, lon, lat): validates scene has enough water near station
  • Transect sampling: sample_transect_stack(stack, water_mask, lon, lat, flow_az_deg, half_width_m, pixel_spacing_m, n_adjacent) → median Rrs across the cross-section
    • nearest_flow_bearing(lon, lat, centerline_gdf): derives flow azimuth from nearest river centreline segment
    • Water-mask filtered: only pixels with water_mask > 0 contribute to the median
  • All models picklable and sklearn-compatible (fit/predict interface)
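
The B-grid scan with a closed-form A follows from least squares: for a fixed B, minimizing Σ(y − A·exp(Bx))² over A gives A = Σ(y·exp(Bx))/Σ(exp(Bx)²), so only B needs to be searched. The sketch below shows that idea together with a chronological 70/30 split on synthetic, noise-free data. It is a minimal illustration, not the library code: fit_exp_ab_grid_sketch, chronological_split, and the B grid are hypothetical, and the split works on sample indices of date-sorted data rather than on distinct dates.

```python
import math
from typing import List, Sequence, Tuple


def fit_exp_ab_grid_sketch(
    x: Sequence[float], y: Sequence[float], b_grid: Sequence[float]
) -> Tuple[float, float, float]:
    """Fit y ~ A*exp(B*x) by scanning B and solving A in closed form.

    Returns (A, B, sse) for the grid value of B with the lowest squared error.
    """
    best = None
    for b in b_grid:
        e = [math.exp(b * xi) for xi in x]
        a = sum(yi * ei for yi, ei in zip(y, e)) / sum(ei * ei for ei in e)
        sse = sum((yi - a * ei) ** 2 for yi, ei in zip(y, e))
        if best is None or sse < best[2]:
            best = (a, b, sse)
    if best is None:
        raise ValueError("b_grid must not be empty")
    return best


def chronological_split(n: int, train_frac: float = 0.7) -> Tuple[List[int], List[int]]:
    """Split indices of date-sorted samples: earliest part trains, the rest tests."""
    cut = round(n * train_frac)
    return list(range(cut)), list(range(cut, n))


# Synthetic example: y = 5*exp(40*x), samples assumed already sorted by date.
x = [0.01 * i for i in range(1, 11)]
y = [5.0 * math.exp(40.0 * xi) for xi in x]
train, test = chronological_split(len(x))
b_grid = [float(b) for b in range(0, 101, 5)]  # B candidates 0, 5, ..., 100
a_hat, b_hat, _ = fit_exp_ab_grid_sketch(
    [x[i] for i in train], [y[i] for i in train], b_grid
)
rmse = math.sqrt(
    sum((y[i] - a_hat * math.exp(b_hat * x[i])) ** 2 for i in test) / len(test)
)
```

Because A is solved exactly for each B, the search is one-dimensional and cannot get stuck in a local minimum of A, which is one reason a grid scan can replace an iterative optimizer here.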

Training Data Quality Notes

  • Flood-season (Aug–Sep) scenes may show anomalously low Rrs despite high TSS. Likely causes are cloud remnants and overbank flooding that mixes terrestrial pixels into the station sample.
  • Use min_water_pixels ≥ 10 in the Phase 9 config to filter such scenes.
  • TSS distribution is log-normal (right-skewed) → ML models should train in log1p space.
  • r(red, TSS) < 0.1 is a red flag: check water pixel count per scene in training_dataset.csv.
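
Two of these notes are easy to prototype. The sketch below shows a generic log1p target wrapper around any fit/predict model and a plain Pearson check for the r(red, TSS) red flag. It is an assumption-laden illustration, not the pipeline's LogSpaceWrapper: Log1pTarget, pearson_r, and red_flag are hypothetical names.

```python
import math
from typing import List, Sequence


class Log1pTarget:
    """Wrap a fit/predict model so it trains on log1p(y) and predicts in g/m3."""

    def __init__(self, model):
        self.model = model

    def fit(self, X, y):
        self.model.fit(X, [math.log1p(v) for v in y])
        return self

    def predict(self, X) -> List[float]:
        # Invert the transform so callers always see TSS in original units.
        return [math.expm1(v) for v in self.model.predict(X)]


def pearson_r(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def red_flag(red: Sequence[float], tss: Sequence[float], threshold: float = 0.1) -> bool:
    """True when r(red, TSS) falls below the threshold named in the notes."""
    return pearson_r(red, tss) < threshold
```

When red_flag returns True, the next step suggested by the notes is to inspect the per-scene water pixel counts in training_dataset.csv before changing any model.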

References

  • Progress tracker: F:/Loc/Sat/.github/skills/tss-pipeline/references/progress.md
  • Phase 05b technical note: F:/Loc/Sat/.github/skills/tss-pipeline/references/phase-05b-sentinel2-water-extraction.md

Continuation

Always check the progress tracker before writing new code. When an algorithm, CLI option, or QA field changes, append the technical behavior change to the references.
