tss-pipeline
Use when implementing or debugging the TSS remote-sensing workflow in this workspace: Landsat/Sentinel preprocessing, ACOLITE atmospheric correction, cloud/water masking, adjacency correction, station matchup, and model training.
Install
mkdir -p .claude/skills/tss-pipeline && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/15138" && unzip -o skill.zip -d .claude/skills/tss-pipeline && rm skill.zip
Installs to .claude/skills/tss-pipeline
TSS Pipeline Skill
Scope
This skill applies to the workspace pipeline for TSS estimation in Ma/Ca rivers.
Key Inputs
- Landsat archives: 2.RAW_Data/1.Landsat/*.tar
- Sentinel-2 archives: 2.RAW_Data/2.Sentinel-2/*.SAFE.zip
- AERONET: 2.RAW_Data/3.AERONET/...
- In-situ per station: 2.RAW_Data/4.Observed_Data/*.csv (Date, TSS(g/m3))
Key Config
- config/paths_config.json
- config/processing_config.json
- config/stations_config.json
Main Script Map
- 01a_clip_raw_landsat.py
- 01b_clip_raw_sentinel2.py
- 02a_radio_cal_landsat.py
- 02b_radio_cal_sentinel2.py
- 03a_atmos_corr_landsat.py
- 03b_atmos_corr_sentinel2.py
- 04a_cloud_mask_landsat.py
- 04b_cloud_mask_sentinel2.py
- 05a_water_extract_landsat.py
- 05b_water_extract_sentinel2.py
- 06a_sunglint_qa_landsat.py
- 06b_sunglint_qa_sentinel2.py
- 07a_adjacency_corr_landsat.py
- 07b_adjacency_corr_sentinel2.py
- 07c_tmart_aec.py (env tmart, parallel method)
- 08a_resample_harmonize_landsat.py (30m stack)
- 08b_resample_harmonize_sentinel2.py (10m stack)
- 09a_tss_model_build_landsat.py
- 09b_tss_model_build_sentinel2.py
- 10_tss_estimate.py
- _utils/tss_models.py (shared model library: 17 models, analytical fitting, transect sampling, LogSpaceWrapper)
ACOLITE Notes
- Use _utils/acolite_runner.py to execute ACOLITE.
- Run with the primary settings first, then retry with the fallback settings.
- For this ACOLITE codebase, call ac.acolite.acolite_run(...).
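The primary-then-fallback pattern can be sketched as follows. This is only an illustration under stated assumptions, not the contents of _utils/acolite_runner.py: `run_with_fallback`, the `run` callable and the settings dictionaries are hypothetical names, and in the pipeline `run` would presumably wrap `ac.acolite.acolite_run(...)`.

```python
from typing import Any, Callable, Dict, Tuple


def run_with_fallback(
    run: Callable[[Dict[str, Any]], Any],
    primary: Dict[str, Any],
    fallback: Dict[str, Any],
) -> Tuple[str, Any]:
    """Try `run` with the primary settings; on failure retry with the fallback.

    `run` is a hypothetical wrapper around the ACOLITE call and is assumed
    to raise an exception when processing fails.
    """
    try:
        return "primary", run(primary)
    except Exception as exc:  # broad on purpose: any failure triggers the fallback
        print(f"primary settings failed ({exc}); retrying with fallback settings")
        return "fallback", run(fallback)


# Stand-in runner so the sketch runs without ACOLITE installed.
# The "strict" key is made up for the demo.
def fake_run(settings: Dict[str, Any]) -> str:
    if settings.get("strict"):
        raise RuntimeError("simulated failure")
    return "ok"


used, result = run_with_fallback(fake_run, {"strict": True}, {"strict": False})
```

Returning the label of the settings that succeeded makes it easy to record the outcome in a QA CSV alongside the scene.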
Phase Order
- Phase 0 setup/config
- Phase 1 clip raw scenes by station corridor
- Phase 2 radiometric QA (standalone — does not feed downstream)
- Phase 3 atmospheric correction (ACOLITE) → 03.Atmospheric_Correction/
- Phase 4 cloud/shadow masking → 04.Cloud_Shadow_Masking/
- Phase 5 water extraction → 05.Water_Extraction/
- Phase 6 sunglint QA/correction → 06.Sunglint_QA/ (optional enhancement)
- Phase 7 adjacency correction (RAdCor) → 07.Adjacency_Correction/RAdCor/ (optional enhancement)
- Phase 7c T-Mart AEC → 07.Adjacency_Correction/tmart/ (parallel method, env tmart, standalone)
- Phase 8a/8b harmonize/resample → reads Phase 07 RAdCor if available, else Phase 06 glintcorr, else Phase 03
  - 08a: Landsat → Rrs_stack_30m.tif
  - 08b: Sentinel-2 → Rrs_stack_10m.tif (10m; native Green/Red resolution)
- Phase 9a/9b model build → reads Phase 08 stacks + in-situ CSV; trains 17 models
  - Feature vector: [green, red, nir, ndti, red_nir_ratio, red_green_ratio]
  - ML and empirical models trained in log1p(TSS) space by default (log_transform_ml=true, log_transform_empirical=true)
  - Training filter: min_water_pixels (default 10) rejects scenes with insufficient water at station
  - Sampling: sampling_method=transect extracts Rrs along a cross-section perpendicular to flow; n_adjacent_sections adds parallel sections; max_cross_section_half_width_m caps transect width
- Phase 10 estimation/time series → reads Phase 08 stacks + sensor-specific Phase 09 model
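As an illustration of the Phase 9 feature vector, the sketch below builds the six features for a single sample. The formulas are assumptions rather than something taken from the pipeline source: NDTI is taken as (red - green) / (red + green), the ratios as red / nir and red / green, and `build_features` and `safe_ratio` are hypothetical helper names.

```python
import math

FEATURE_NAMES = ["green", "red", "nir", "ndti", "red_nir_ratio", "red_green_ratio"]


def safe_ratio(num, den, eps=1e-6):
    """Divide, returning NaN when the denominator is (near) zero."""
    return num / den if abs(den) > eps else math.nan


def build_features(green, red, nir):
    """Return the six features in FEATURE_NAMES order for one Rrs sample.

    Assumed definitions (not confirmed by the pipeline code):
      ndti            = (red - green) / (red + green)
      red_nir_ratio   = red / nir
      red_green_ratio = red / green
    """
    ndti = safe_ratio(red - green, red + green)
    return [green, red, nir, ndti, safe_ratio(red, nir), safe_ratio(red, green)]


# Made-up Rrs values (sr^-1) for one sample; the second call has zero NIR.
features = build_features(green=0.02, red=0.03, nir=0.01)
no_nir = build_features(green=0.02, red=0.03, nir=0.0)
```

Returning NaN for a degenerate ratio keeps the sample visible so it can be dropped explicitly before training instead of silently producing an infinite feature.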
Completion Checks
- Phase 1: 3.Pre-Processing/01.Clip_Raw/QA/*.csv exists and has rows.
- Phase 3: 3.Pre-Processing/03.Atmospheric_Correction/QA/*.csv exists and statuses are not failing.
- Phase 5: 3.Pre-Processing/05.Water_Extraction/QA/*.csv exists and includes water_coverage_pct.
- Phase 8a/8b: 3.Pre-Processing/08.Resampled/QA/resample_harmonize_{landsat,sentinel2}.csv exist; check input_source column.
- Phase 9a/9b: 4.Results/TSS_Models/{Landsat,Sentinel2}/model_comparison.csv has 17 rows; best_model_meta.json status=ok.
- Phase 10: 4.Results/Time_Series/TSS_stations_timeseries.csv exists with tss_estimated_g_m3 values.
- Station extraction uses only rows with valid lon/lat in stations_config.json.
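Most of these checks reduce to "the QA CSV exists, has rows, and carries a given column". A minimal sketch of such a check, assuming plain UTF-8 CSVs with a header row; `check_qa_csv` is a hypothetical helper, not part of the pipeline, and the demo writes a throwaway file rather than touching real outputs.

```python
import csv
import glob
import os
import tempfile


def check_qa_csv(pattern, required_column=None, expected_rows=None):
    """Return True if every CSV matching `pattern` passes the checks.

    Fails when nothing matches, a file has no data rows, a required
    column is missing, or the row count differs from `expected_rows`.
    """
    paths = glob.glob(pattern)
    if not paths:
        return False
    for path in paths:
        with open(path, newline="", encoding="utf-8") as fh:
            reader = csv.DictReader(fh)
            rows = list(reader)
            columns = reader.fieldnames or []
        if not rows:
            return False
        if required_column and required_column not in columns:
            return False
        if expected_rows is not None and len(rows) != expected_rows:
            return False
    return True


# Demo on a temporary tree that mimics the Phase 5 QA folder.
with tempfile.TemporaryDirectory() as root:
    qa_dir = os.path.join(root, "3.Pre-Processing", "05.Water_Extraction", "QA")
    os.makedirs(qa_dir)
    with open(os.path.join(qa_dir, "demo.csv"), "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["scene", "water_coverage_pct"])
        writer.writerow(["demo_scene", "42.0"])
    pattern = os.path.join(qa_dir, "*.csv")
    phase5_ok = check_qa_csv(pattern, required_column="water_coverage_pct")
    wrong_column = check_qa_csv(pattern, required_column="input_source")
```

The same helper would cover the Phase 9 check with `expected_rows=17` on model_comparison.csv.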
Phase 08 Input Priority (important)
08a/08b read bands via select_input_scene() in this priority order, highest first:
- 07.Adjacency_Correction/<sensor>/RAdCor/<station>/<scene>/ (highest quality)
- 06.Sunglint_QA/<sensor>/<station>/<scene>/ (glint-corrected; uses *_glintcorr.tif)
- 03.Atmospheric_Correction/<sensor>/<station>/<scene>/ (fallback)
The input_source field in the QA CSV records which source was used for each scene.
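The priority rule can be sketched as a first-match search over candidate folders. This is not the real select_input_scene(): `pick_input_scene`, the labels ("radcor", "glintcorr", "atmos") and the `*.tif` patterns for the RAdCor and Phase 03 folders are assumptions made for illustration; only the folder layout and the `*_glintcorr.tif` pattern come from the text above.

```python
import glob
import os
import tempfile

# (label, path template, file pattern), highest priority first.
# Labels are hypothetical; the real input_source values may differ.
CANDIDATES = [
    ("radcor", "07.Adjacency_Correction/{sensor}/RAdCor/{station}/{scene}", "*.tif"),
    ("glintcorr", "06.Sunglint_QA/{sensor}/{station}/{scene}", "*_glintcorr.tif"),
    ("atmos", "03.Atmospheric_Correction/{sensor}/{station}/{scene}", "*.tif"),
]


def pick_input_scene(base_dir, sensor, station, scene):
    """Return (label, directory) for the first candidate holding matching files."""
    for label, template, pattern in CANDIDATES:
        rel = template.format(sensor=sensor, station=station, scene=scene)
        directory = os.path.join(base_dir, *rel.split("/"))
        if glob.glob(os.path.join(directory, pattern)):
            return label, directory
    return None, None


# Demo on a throwaway tree where only Phase 03 and Phase 06 outputs exist.
with tempfile.TemporaryDirectory() as base:
    demo_files = [(CANDIDATES[2][1], "B4.tif"), (CANDIDATES[1][1], "B4_glintcorr.tif")]
    for template, name in demo_files:
        rel = template.format(sensor="landsat", station="ST01", scene="scene_A")
        directory = os.path.join(base, *rel.split("/"))
        os.makedirs(directory)
        open(os.path.join(directory, name), "w").close()
    source, _ = pick_input_scene(base, "landsat", "ST01", "scene_A")
    missing, _ = pick_input_scene(base, "landsat", "ST01", "scene_B")
```

Checking for matching files rather than just the folder avoids selecting an empty directory left behind by a failed run.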
Phase 09 Model Library (_utils/tss_models.py)
- build_all_models(n_starts, target_r2, log_transform_ml, log_transform_empirical) → 17 models
  - Group 1, Empirical (5): PowerLaw, Exponential, Poly2Red, Poly2NDTI, BandRatioPower
  - Group 2, Semi-empirical (4): Nechad2010 (C_N fixed=0.1724 sr), Nechad2010Full, Doxaran2002, HanMiller
  - Group 3, OLS log-linear (3): PowerLawOLS, ExponentialOLS, BandRatioPowerOLS
  - Group 4, ML (5): LinearRegression, RidgeCV, RandomForest(n=300), GradientBoosting(n=200,depth=4), SVR_RBF
- Fitting methods (no AutoTuner):
  - _fit_exp_ab_grid: B-grid scan + closed-form A = Σ(y·exp(Bx))/Σ(exp(Bx)²)
  - _fit_power_ols: log-log OLS via lstsq
  - _fit_poly2_ols: direct lstsq for 2nd-degree polynomial
  - _fit_nechad_grid: C-grid scan + closed-form A for each C
- LogSpaceWrapper: wraps ML and empirical models to fit/predict in log1p(TSS) space (controlled by log_transform_ml and log_transform_empirical)
- ModelEvaluator: chronological 70/30 temporal split (first 70% of dates = train, last 30% = test), metrics = RMSE/MAE/R²/bias/MAPE_pct
- evaluate_with_predictions(): returns (metrics, all_preds[n], split_labels[n])
- count_water_pixels(water_mask_path, lon, lat): validates scene has enough water near station
- Transect sampling:
  - sample_transect_stack(stack, water_mask, lon, lat, flow_az_deg, half_width_m, pixel_spacing_m, n_adjacent) → median Rrs across the cross-section
  - nearest_flow_bearing(lon, lat, centerline_gdf): derives flow azimuth from nearest river centreline segment
  - Water-mask filtered: only pixels with water_mask > 0 contribute to the median
- All models picklable and sklearn-compatible (fit/predict interface)
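To make the B-grid scan and the chronological split concrete, here is a small sketch on synthetic data. It is not the code in _utils/tss_models.py: `fit_exp_ab_grid` and `chronological_split` are hypothetical stand-ins, the grid range is arbitrary, and the split works per sample sorted by date, which may differ from how ModelEvaluator groups dates.

```python
import math


def fit_exp_ab_grid(x, y, b_grid):
    """Fit y ≈ A·exp(B·x): scan B over `b_grid`, solve A in closed form.

    For a fixed B the least-squares A is Σ(y·e^{Bx}) / Σ(e^{2Bx}).
    Returns (A, B, sse) for the B with the smallest squared error.
    """
    best = None
    for b in b_grid:
        e = [math.exp(b * xi) for xi in x]
        a = sum(yi * ei for yi, ei in zip(y, e)) / sum(ei * ei for ei in e)
        sse = sum((yi - a * ei) ** 2 for yi, ei in zip(y, e))
        if best is None or sse < best[2]:
            best = (a, b, sse)
    return best


def chronological_split(dates, train_frac=0.7):
    """Indices of the earliest `train_frac` of samples (train) and the rest (test)."""
    order = sorted(range(len(dates)), key=lambda i: dates[i])
    n_train = int(round(train_frac * len(order)))
    return order[:n_train], order[n_train:]


# Synthetic, noise-free data generated from TSS = 5 * exp(20 * Rrs_red).
dates = [f"2020-01-{day:02d}" for day in range(1, 11)]  # ISO strings sort by time
red = [0.01 * (i + 1) for i in range(10)]
tss = [5.0 * math.exp(20 * r) for r in red]

train, test = chronological_split(dates)
a, b, _ = fit_exp_ab_grid(
    [red[i] for i in train],
    [tss[i] for i in train],
    b_grid=[float(k) for k in range(0, 41)],
)

# RMSE on the held-out (latest) samples.
rmse = math.sqrt(sum((tss[i] - a * math.exp(b * red[i])) ** 2 for i in test) / len(test))
```

Because A has a closed form for every B, the search is one-dimensional and deterministic, which is presumably why no iterative tuner is needed for these models.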
Training Data Quality Notes
- Flood season (Aug–Sep) scenes may have anomalously low Rrs despite high TSS (cloud remnants, overbank flooding mixing terrestrial pixels with station measurements).
- Use min_water_pixels ≥ 10 in phase9 config to filter such scenes.
- TSS distribution is log-normal (right-skewed) → ML models should train in log1p space.
- r(red, TSS) < 0.1 is a red flag: check water pixel count per scene in training_dataset.csv.
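The effect of training in log1p space can be shown with a toy wrapper. This is a sketch of the idea only: `LogSpaceSketch` and `MeanModel` are hypothetical classes, not the pipeline's LogSpaceWrapper, and the TSS values are made up.

```python
import math


class MeanModel:
    """Toy stand-in for a real regressor: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class LogSpaceSketch:
    """Fit the wrapped model on log1p(y); map predictions back with expm1."""

    def __init__(self, model):
        self.model = model

    def fit(self, X, y):
        self.model.fit(X, [math.log1p(v) for v in y])
        return self

    def predict(self, X):
        return [math.expm1(p) for p in self.model.predict(X)]


# Right-skewed made-up TSS values (g/m3): one flood sample dominates the mean.
tss = [10.0, 12.0, 15.0, 20.0, 400.0]
X = [[0.0] for _ in tss]  # features are ignored by MeanModel

raw_pred = MeanModel().fit(X, tss).predict(X[:1])[0]
log_pred = LogSpaceSketch(MeanModel()).fit(X, tss).predict(X[:1])[0]
```

With the raw target the single flood value pulls the prediction far above the typical samples, while the log-space fit stays closer to the bulk of the data. Any model with a fit/predict interface could take the place of `MeanModel`.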
References
- Progress tracker: F:/Loc/Sat/.github/skills/tss-pipeline/references/progress.md
- Phase 05b technical note: F:/Loc/Sat/.github/skills/tss-pipeline/references/phase-05b-sentinel2-water-extraction.md
Continuation
Always check the progress tracker before writing new code and append technical behavior changes to references when algorithm/CLI/QA fields change.