Quick Start
This page walks through one standard classical profiling run for CellPainting-Claw, from input data to final profile outputs.
The run shown here does five concrete things in order:
check the configured data source and stage a small demo download
run CellProfiler extraction so the measurement tables are available
merge those tables into one single-cell table
use pycytominer to aggregate, annotate, normalize, and feature-select the classical profiles
write summary tables and PCA views for quick inspection
DeepProfiler is not part of this page.
All output cells below are real recorded outputs from the current repository demo assets and current runtime.
Install
From the repository root:
conda env create -f environment/cellpainting-claw.environment.yml
conda activate cellpainting-claw
pip install -e .[data-access]
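After installing, it can help to confirm that the command-line entrypoint is visible in the active environment. The sketch below assumes the editable install exposes a `cellpainting-skills` executable, which is the command used on the rest of this page; `have_cmd` is a hypothetical helper name and not part of the package.

```shell
# Hypothetical helper: succeed only if the named command is found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}
```

For example, `have_cmd cellpainting-skills || echo "cellpainting-skills not found; is the conda environment active?"` prints a hint when the command is missing.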
Repository Root
Run the remaining commands on this page from the repository root.
cd /path/to/CellPainting-Claw
Run Variables
Run these three lines once in your terminal from the repository root before the commands below.
CONFIG points to the demo project config file
DATA_ROOT is the directory for the data-access demo outputs
RUN_ROOT is the directory for the classical profiling demo outputs
The later commands reuse these names as $CONFIG, $DATA_ROOT, and $RUN_ROOT. If you prefer, you can replace them with the full paths directly in each command.
CONFIG=configs/project_config.demo.json
DATA_ROOT=demo/workspace/outputs/quick_start_data
RUN_ROOT=demo/workspace/outputs/quick_start_classical
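Because every later command depends on these three names, a quick guard against an unset or empty variable can save a confusing failure further down. This is a minimal sketch; `require_vars` is a hypothetical helper, not something the repository provides.

```shell
# Hypothetical helper: return non-zero if any named shell variable is unset or empty.
require_vars() {
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      return 1
    fi
  done
}
```

Calling `require_vars CONFIG DATA_ROOT RUN_ROOT` in the same terminal reports the first name that still needs to be set.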
Prepare Input Data
This section prepares input data before classical profiling starts.
If your images are still in the Cell Painting Gallery, the usual sequence is:
check which dataset and source are configured
build a download plan for the subset you want
download that subset into local storage
If your input files are already present locally, you can skip this section and move to the CellProfiler steps below.
The commands below were run against the live Cell Painting Gallery with the demo config.
Inspect Configured Sources
cellpainting-skills run \
--config "$CONFIG" \
--skill data-inspect-availability \
--output-dir "$DATA_ROOT/01_inspect"
Default dataset: cpg0016-jump
Gallery datasets discovered: 42
Sources discovered under the default dataset: 14
Required data-access packages were available in the recorded runtime
Files written in $DATA_ROOT/01_inspect:
data_access_summary.json
pipeline_skill_manifest.json
Download Plan
cellpainting-skills run \
--config "$CONFIG" \
--skill data-plan-download \
--dataset-id cpg0016-jump \
--source-id source_4 \
--max-files 4 \
--output-dir "$DATA_ROOT/02_plan"
Resolved dataset: cpg0016-jump
Resolved source: source_4
Resolved Gallery prefix: cpg0016-jump/source_4/
Planned steps: 1
File cap in this example plan: 4
Files written in $DATA_ROOT/02_plan:
download_plan.json
pipeline_skill_manifest.json
Download A Small Local Input Slice
cellpainting-skills run \
--config "$CONFIG" \
--skill data-download \
--dataset-id cpg0016-jump \
--source-id source_4 \
--subprefix workspace/load_data_csv/2021_04_26_Batch1/BR00117035 \
--output-dir "$DATA_ROOT/03_download_small"
Downloaded prefix: cpg0016-jump/source_4/workspace/load_data_csv/2021_04_26_Batch1/BR00117035/
Matched files: 2
Downloaded files: 2
Downloaded filenames: load_data.csv, load_data_with_illum.csv
Files written in $DATA_ROOT/03_download_small:
downloads/download_manifest.json
downloads/load_data.csv
downloads/load_data_with_illum.csv
This bounded download only demonstrates how remote Gallery data can be staged locally. The classical profiling steps below continue from the repository demo assets, which already include a minimal CellProfiler result set.
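To see what the staged files contain, one option is to list the column names of a downloaded CSV. The sketch below uses only standard tools; `csv_header` is a hypothetical helper, and it assumes the header row has no quoted commas and uses Unix line endings.

```shell
# Hypothetical helper: print the column names of a CSV file, one per line.
# Assumes the header row contains no quoted commas.
csv_header() {
  head -n 1 "$1" | tr ',' '\n'
}
```

For the recorded download, `csv_header "$DATA_ROOT/03_download_small/downloads/load_data.csv"` would list the columns of the first staged file.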
Measurement Tables
This step makes the CellProfiler measurement tables available for the rest of the classical profiling path.
cellpainting-skills run \
--config "$CONFIG" \
--skill cp-extract-measurements \
--output-dir "$RUN_ROOT/01_measurements"
Demo mode: bundled measurement tables were reused
Exposed tables: Image.csv, Cells.csv, Nuclei.csv
Public skill entrypoint: cp-extract-measurements
Files available in this step:
Image.csv
Cells.csv
Nuclei.csv
pipeline_skill_manifest.json
In this public demo checkout, the original profiling backend script is not packaged, so these tables come from the bundled demo backend: for the recorded demo run, the skill reuses the bundled measurement tables instead of rerunning CellProfiler. In a user-owned workspace, the same skill remains the public entrypoint for the measurement stage.
Single-Cell Table
cellpainting-skills run \
--config "$CONFIG" \
--skill cp-build-single-cell-table \
--image-csv-path demo/backend/profiling_backend/outputs/cellprofiler/Image.csv \
--object-table-path demo/backend/profiling_backend/outputs/cellprofiler/Cells.csv \
--object-table Cells \
--output-dir "$RUN_ROOT/02_single_cell"
Single-cell rows written: 4
Columns written: 16
Object table used for the merge: Cells
Files written in $RUN_ROOT/02_single_cell:
single_cell.csv.gz
pipeline_skill_manifest.json
This step merges the CellProfiler tables into one single-cell feature table. The pycytominer steps below use that merged table as their input.
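A quick way to sanity-check the merged table is to count its data rows and compare the result with the row count the step reported. The sketch below is one way to do that with standard tools; `count_csv_gz_rows` is a hypothetical helper, and it assumes no field contains an embedded newline.

```shell
# Hypothetical helper: count data rows (excluding the header) in a gzip-compressed CSV.
# Assumes one record per line, i.e. no embedded newlines inside quoted fields.
count_csv_gz_rows() {
  gzip -dc "$1" | awk 'NR > 1 { n++ } END { print n + 0 }'
}
```

Running `count_csv_gz_rows "$RUN_ROOT/02_single_cell/single_cell.csv.gz"` after the recorded demo run should agree with the "Single-cell rows written" value shown above.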
Classical Profiles
Aggregate Profiles
cellpainting-skills run \
--config "$CONFIG" \
--skill cyto-aggregate-profiles \
--single-cell-path "$RUN_ROOT/02_single_cell/single_cell.csv.gz" \
--output-dir "$RUN_ROOT/03_cyto_aggregate"
Aggregated profile rows: 2
Aggregated profile columns: 14
Files written in $RUN_ROOT/03_cyto_aggregate:
pycytominer/aggregated.parquet
pipeline_skill_manifest.json
Annotate Profiles
cellpainting-skills run \
--config "$CONFIG" \
--skill cyto-annotate-profiles \
--aggregated-path "$RUN_ROOT/03_cyto_aggregate/pycytominer/aggregated.parquet" \
--output-dir "$RUN_ROOT/04_cyto_annotate"
Annotated profile rows: 2
Annotated profile columns: 17
Files written in $RUN_ROOT/04_cyto_annotate:
pycytominer/annotated.parquet
pipeline_skill_manifest.json
Normalize Profiles
cellpainting-skills run \
--config "$CONFIG" \
--skill cyto-normalize-profiles \
--annotated-path "$RUN_ROOT/04_cyto_annotate/pycytominer/annotated.parquet" \
--output-dir "$RUN_ROOT/05_cyto_normalize"
Normalized profile rows: 2
Normalized profile columns: 17
Files written in $RUN_ROOT/05_cyto_normalize:
pycytominer/normalized.parquet
pipeline_skill_manifest.json
Select Profile Features
cellpainting-skills run \
--config "$CONFIG" \
--skill cyto-select-profile-features \
--normalized-path "$RUN_ROOT/05_cyto_normalize/pycytominer/normalized.parquet" \
--output-dir "$RUN_ROOT/06_cyto_select"
Feature-selected profile rows: 2
Feature-selected profile columns: 12
Files written in $RUN_ROOT/06_cyto_select:
pycytominer/feature_selected.parquet
pipeline_skill_manifest.json
These four pycytominer stages turn the merged single-cell measurements into a cleaned well-level profile table: first aggregate, then annotate, normalize, and finally select features.
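Since each stage reads the file the previous stage wrote, the four commands can be chained so that a failure stops the sequence. The sketch below only repeats the commands and flags shown on this page; the function name `run_pycytominer_stages` is hypothetical, and it assumes the CLI returns a non-zero exit status when a skill fails.

```shell
# Sketch: run the four pycytominer skills in order, stopping at the first failure.
# Assumes CONFIG and RUN_ROOT are set as in Run Variables and that
# $RUN_ROOT/02_single_cell/single_cell.csv.gz already exists.
run_pycytominer_stages() {
  cellpainting-skills run \
    --config "$CONFIG" \
    --skill cyto-aggregate-profiles \
    --single-cell-path "$RUN_ROOT/02_single_cell/single_cell.csv.gz" \
    --output-dir "$RUN_ROOT/03_cyto_aggregate" &&
  cellpainting-skills run \
    --config "$CONFIG" \
    --skill cyto-annotate-profiles \
    --aggregated-path "$RUN_ROOT/03_cyto_aggregate/pycytominer/aggregated.parquet" \
    --output-dir "$RUN_ROOT/04_cyto_annotate" &&
  cellpainting-skills run \
    --config "$CONFIG" \
    --skill cyto-normalize-profiles \
    --annotated-path "$RUN_ROOT/04_cyto_annotate/pycytominer/annotated.parquet" \
    --output-dir "$RUN_ROOT/05_cyto_normalize" &&
  cellpainting-skills run \
    --config "$CONFIG" \
    --skill cyto-select-profile-features \
    --normalized-path "$RUN_ROOT/05_cyto_normalize/pycytominer/normalized.parquet" \
    --output-dir "$RUN_ROOT/06_cyto_select"
}
```

Running the stages one at a time, as on this page, makes each intermediate output easier to inspect; the chained form is mainly convenient for reruns.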
Summary Outputs
cellpainting-skills run \
--config "$CONFIG" \
--skill cyto-summarize-classical-profiles \
--feature-selected-path "$RUN_ROOT/06_cyto_select/pycytominer/feature_selected.parquet" \
--output-dir "$RUN_ROOT/07_cyto_summary"
Summary rows represented: 2
Features retained at this stage: 6
Top variable features reported: 6
PCA components written: 2
Files written in $RUN_ROOT/07_cyto_summary:
profile_summary.json
well_metadata_summary.csv
top_variable_features.csv
pca_coordinates.csv
pca_plot.png
pipeline_skill_manifest.json
This final step turns the processed profile table into files that are easier to inspect directly: a compact summary, metadata summaries, top-variable features, and PCA outputs.
Result Files
After this Quick Start, the most useful files to inspect are:
data_access_summary.json for the configured source inventory
download_plan.json for the resolved Gallery request
single_cell.csv.gz for the merged single-cell measurements table
aggregated.parquet, annotated.parquet, normalized.parquet, and feature_selected.parquet for the pycytominer stages
profile_summary.json, well_metadata_summary.csv, top_variable_features.csv, and pca_plot.png for the final review layer
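To confirm that a run produced the files listed above, a small existence check is often enough. This is a sketch under the directory layout recorded on this page; `check_files` is a hypothetical helper, not a repository command.

```shell
# Hypothetical helper: report every listed path that is not a regular file.
# Returns 0 if all files exist, 1 otherwise.
check_files() {
  missing=0
  for path in "$@"; do
    if [ ! -f "$path" ]; then
      echo "missing: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_files "$RUN_ROOT/06_cyto_select/pycytominer/feature_selected.parquet" "$RUN_ROOT/07_cyto_summary/profile_summary.json" "$RUN_ROOT/07_cyto_summary/pca_plot.png"` lists any of those outputs that the run did not write.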
Next Pages
Continue with these pages after the first run:
CellPainting-Skills for the full skill catalog
Command-Line Interface for direct CLI usage
OpenClaw for agent-mediated use of the same skills