Welcome to pythologist’s documentation!

Readme File

pythologist

Read and analyze cell image data.

Intro

Pythologist 1) reads exports from InForm software or other sources into a common storage format, and 2) extracts basic analysis features from cell image data. This software is generally intended to be run from a Jupyter notebook and provides hooks into the image data so that users have the flexibility to execute analyses they design or find in the primary literature.

List of image analysis publications

Pythologist is based on **IrisSpatialFeatures** (C.D. Carey, D. Gusenleitner, M. Lipshitz, et al. Blood. 2017) https://doi.org/10.1182/blood-2017-03-770719, and is implemented in the Python programming language.

Features Pythologist adds are:

  • A common CellProjectGeneric storage class, and classical inheritance conventions to organize the importation of different data types.

  • A mutable CellDataFrame class that can be used for slicing and combining projects.

  • The ability to add binary features to cells based on cell-cell contacts or cell proximity.

  • Customizable images based on the cell segmentation or heatmaps spanning the Cartesian coordinates.

  • Specification of cell populations through a SubsetLogic syntax for quick selection of mutually exclusive phenotypes or binary features.

  • A set of Quality Check functions to identify potential issues in imported data.

Documentation

Primary Software

  • pythologist This software package uses a CellDataFrame class, an extension of a pandas DataFrame, to modify data and execute analyses. [Read the Docs] [source]

    • pythologist-schemas This submodule documents/defines the formats of inputs and outputs expected in this pipeline. [source]

    • pythologist-reader This submodule facilitates reading platform-specific data into a harmonized format. [Read the Docs] [source]

    • pythologist-test-images This submodule has some example data [source]

    • pythologist-image-utilities This submodule has helper functions to work with images [Read the Docs] [source]

Additional Analytics

  • good-neighbors This package facilitates the analysis of cellular data based on their proximal “cellular neighborhoods” [Read the Docs] [source]

About Submodules

This primary module pythologist is comprised of submodules.

All of these can be cloned at once via https with the command:

$ git clone --recurse-submodules https://github.com/jason-weirather/pythologist.git

or via ssh

$ git clone --recurse-submodules git@github.com:jason-weirather/pythologist.git

Submodules will be in the libs/ directory. For development purposes you should

  1. checkout and pull the master branch of each of these submodules

  2. install each of these submodules as editable via pip install -e .

  3. install the main pythologist as editable the same way pip install -e .

There is probably a more elegant way to use setuptools to assist in this process that I’m not doing here.

Quickstart

To start a Jupyter Lab notebook with the required software, running as your user in your current directory, you can use the following command:

docker run --rm -p 8888:8888 --user $(id -u):$(id -g) -v $(pwd):/work vacation/pythologist:latest

This will start jupyter lab on port 8888 as your user and group.

Any of the test data examples should work fine in this environment.

Installation

Install by pip

$ pip install pythologist

Common tasks

The assumption here is that the exports are grouped so that sample folders contain one or more image exports, and that sample name can be inferred from the last folder name.

from pythologist_test_images import TestImages
from pythologist_reader.formats.inform.sets import CellProjectInForm
import matplotlib.pyplot as plt

# Get the path of the test dataset
path = TestImages().raw('IrisSpatialFeatures')
# Create the storage object where the project will be saved
cpi = CellProjectInForm('pythologist.h5',mode='w')
# Read the project data
cpi.read_path(path,require=False,verbose=True,microns_per_pixel=0.496,sample_name_index=-1)
# Display one of the cell map images
for f in cpi.frame_iter():
    break
print(f.frame_name)
plt.imshow(f.cell_map_image(),origin='upper')
plt.show()

MEL2_7

MEL2_7_cell_map
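
The `sample_name_index=-1` argument used above reflects the stated assumption that the sample name is the last folder in the path. A minimal sketch of that idea, using only the standard library, is shown here. The helper name `infer_sample_name` and the example path are hypothetical and are not part of pythologist-reader.

```python
from pathlib import PurePosixPath

def infer_sample_name(folder, sample_name_index=-1):
    """Pick one component of a folder path to use as the sample name."""
    parts = PurePosixPath(folder).parts
    return parts[sample_name_index]

# Hypothetical layout: <project>/<sample>/<image exports>
sample = infer_sample_name("exports/IrisSpatialFeatures/MEL2")
print(sample)
```

With this layout an index of -1 selects the sample folder, while -2 would select the enclosing project folder.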

Another format supported for a project import is one with a custom tumor and invasive margin definition. Similar to above, the project is organized into sample folders, and each image within each sample folder has a tif file defining the tumor and invasive margin. These come in the form of a <image name prefix>_Tumor.tif and <image name prefix>_Invasive_Margin.tif for each image. The _Tumor.tif is an area filled in where the tumor is, and transparent elsewhere. The _Invasive_Margin.tif is a drawn line of a known width. steps is used to grow the margin out that many pixels in each direction to establish an invasive margin region. Here we also rename some markers during read-in to clean up the syntax of thresholding on binary features.

from pythologist_test_images import TestImages
from pythologist_reader.formats.inform.custom import CellProjectInFormLineArea
import matplotlib.pyplot as plt

# Get the path of the test dataset
path = TestImages().raw('IrisSpatialFeatures')
# Specify where the data read-in will be stored as an h5 object
cpi = CellProjectInFormLineArea('test.h5',mode='w')
# Read in the data (gets stored on the fly into the h5 object)
cpi.read_path(path,
              sample_name_index=-1,
              verbose=True,
              steps=76,
              project_name='IrisSpatialFeatures',
              microns_per_pixel=0.496)
for f in cpi.frame_iter():
    break
print(f.frame_name)
print('hand drawn margin')
plt.imshow(f.get_image(f.get_data('custom_images').\
    set_index('custom_label').loc['Drawn','image_id']),origin='upper')
plt.show()
print('hand drawn tumor area')
plt.imshow(f.get_image(f.get_data('custom_images').\
    set_index('custom_label').loc['Area','image_id']),origin='upper')
plt.show()
print('Mutually exclusive Margin, Tumor, and Stroma')
plt.imshow(f.get_image(f.get_data('regions').\
    set_index('region_label').loc['Margin','image_id']),origin='upper')
plt.show()
plt.imshow(f.get_image(f.get_data('regions').\
    set_index('region_label').loc['Tumor','image_id']),origin='upper')
plt.show()
plt.imshow(f.get_image(f.get_data('regions').\
    set_index('region_label').loc['Stroma','image_id']),origin='upper')
plt.show()

MEL2_2

hand drawn margin

MEL2_2_drawn_line

hand drawn tumor area

MEL2_2_drawn_line

Mutually exclusive Margin, Tumor, and Stroma

MEL2_2_margin MEL2_2_tumor MEL2_2_stroma
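
To illustrate what growing a drawn line by a number of `steps` means, here is a small pure-Python sketch. It assumes growth by one pixel per step to the four direct neighbours; the structuring element actually used by pythologist-reader is not stated in this document, and `grow_mask` is a hypothetical helper, not a library function.

```python
def grow_mask(pixels, steps, shape):
    """Grow a set of (row, col) pixels by `steps` pixels in each direction.

    Assumption: each step adds the four direct neighbours of every pixel.
    """
    rows, cols = shape
    current = set(pixels)
    for _ in range(steps):
        grown = set(current)
        for r, c in current:
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    grown.add((nr, nc))
        current = grown
    return current

# A horizontal drawn line across an 11 x 11 image, grown by 2 steps
line = {(5, col) for col in range(11)}
margin = grow_mask(line, steps=2, shape=(11, 11))
margin_rows = sorted({row for row, _ in margin})

# Physical growth per side for the values used in the example above
growth_um = 76 * 0.496
```

In the toy image the line on row 5 becomes a band covering rows 3 through 7. With `steps=76` and `microns_per_pixel=0.496`, the growth on each side of the line corresponds to roughly 37.7 microns.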

Here we will use the mask, but not expand or subtract from it.

from pythologist_test_images import TestImages
from pythologist_reader.formats.inform.custom import CellProjectInFormCustomMask
import matplotlib.pyplot as plt
path = TestImages().raw('IrisSpatialFeatures')
cpi = CellProjectInFormCustomMask('test.h5',mode='w')
cpi.read_path(path,
              microns_per_pixel=0.496,
              sample_name_index=-1,
              verbose=True,
              custom_mask_name='Tumor',
              other_mask_name='Not-Tumor')
for f in cpi.frame_iter():
    rs = f.get_data('regions').set_index('region_label')
    for r in rs.index:
        print(r)
        plt.imshow(f.get_image(rs.loc[r]['image_id']),origin='upper')
        plt.show()
    break

MEL2_2

Tumor

MEL2_2_tumor

Not-Tumor

MEL2_2_not_tumor

Check general status of the CellDataFrame

cdf = cpi.cdf
cdf.db = cpi
cdf.qc(verbose=True).print_results()

prints the following QC metrics to stdout

==========
Check microns per pixel attribute
PASS
Microns per pixel is 0.496
==========
Check storage object is set
PASS
h5 object is set
==========
Is there a 1:1 correspondence between sample_name and sample_id?
PASS
Good concordance.
Issue count: 0/2
==========
Is there a 1:1 correspondence between frame_name and frame_id?
PASS
Good concordance.
Issue count: 0/4
==========
Is there a 1:1 correspondence between project_name and project_id?
PASS
Good concordance.
Issue count: 0/1
==========
Is the same frame name present in multiple samples?
PASS
frame_name's are all in their own samples
Issue count: 0/4
==========
Are the same phenotypes listed and following rules for mutual exclusion?
PASS
phenotype_calls and phenotype_label follows expected rules
==========
Are the same phenotypes included on all images?
PASS
Consistent phenotypes
Issue count: 0/4
==========
Are the same scored names included on all images?
PASS
Consistent scored_names
Issue count: 0/4
==========
Are the same regions represented the same with an image and across images?
PASS
Consistent regions
Issue count: 0/5
==========
Are the same regions listed matching a valid region_label
PASS
regions and region_label follows expected rules
==========
Do we have any region sizes so small they should consider being excluded?
WARNING
[
    "Very small non-zero regions are included in the data['IrisSpatialFeatures', 'MEL2', 'MEL2_7', {'Margin': 495640, 'Tumor': 947369, 'Stroma': 116}]"
]
Issue count: 1/2
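
Several of the checks above ask whether names and ids correspond 1:1. The sketch below shows one way such a check could be written in plain Python; it is an illustration of the idea rather than the QC code itself, and the function name and the example ids are hypothetical.

```python
from collections import defaultdict

def one_to_one_issues(pairs):
    """Return the names and ids that break a 1:1 name <-> id correspondence."""
    ids_by_name = defaultdict(set)
    names_by_id = defaultdict(set)
    for name, identifier in pairs:
        ids_by_name[name].add(identifier)
        names_by_id[identifier].add(name)
    bad_names = sorted(n for n, ids in ids_by_name.items() if len(ids) > 1)
    bad_ids = sorted(i for i, names in names_by_id.items() if len(names) > 1)
    return bad_names, bad_ids

# Hypothetical (sample_name, sample_id) pairs
good = [("MEL2", "id-a"), ("MEL3", "id-b")]
bad = [("MEL2", "id-a"), ("MEL2", "id-c"), ("MEL3", "id-b")]

good_result = one_to_one_issues(good)
bad_result = one_to_one_issues(bad)
```

An empty result corresponds to a PASS; any listed name or id would count as an issue.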

The cell phenotypes set prior to calling cartesian are the phenotypes available to plot.

from pythologist_test_images import TestImages
from plotnine import *
proj = TestImages().project('IrisSpatialFeatures')
cdf = TestImages().celldataframe('IrisSpatialFeatures')
cdf.db = proj
cart = cdf.cartesian(verbose=True,step_pixels=50,max_distance_pixels=75)
df,cols,rngtop = cart.rgb_dataframe(red='CD8+',green='SOX10+')
shape = cdf.iloc[0]['frame_shape']
(ggplot(df,aes(x='frame_x',y='frame_y',fill='color_str'))
 + geom_point(shape='h',size=4.5,color='#777777',stroke=0.2)
 + geom_vline(xintercept=-1,color="#555555")
 + geom_vline(xintercept=shape[1],color="#555555")
 + geom_hline(yintercept=-1,color="#555555")
 + geom_hline(yintercept=shape[0],color="#555555")
 + facet_wrap('frame_name')
 + scale_fill_manual(cols,guide=False)
 + theme_bw()
 + theme(figure_size=(8,8))
 + theme(aspect_ratio=shape[0]/shape[1])
 + scale_y_reverse()
)

Density Example

from pythologist_test_images import TestImages
from plotnine import *
proj = TestImages().project('IrisSpatialFeatures')
cdf = TestImages().celldataframe('IrisSpatialFeatures')
cdf.db = proj
ch = cdf.db.qc().channel_histograms()
sub = ch.loc[(~ch['threshold_value'].isna())&(ch['channel_label']=='PDL1')]
(ggplot(sub,aes(x='bins',y='counts'))
 + geom_bar(stat='identity')
 + facet_wrap('frame_name')
 + geom_vline(aes(xintercept='threshold_value'),color='red')
 + theme_bw()
 + ggtitle('Thresholding of PDL1\ngiven image pixel intensities')
)

The original component images were not available for the IrisSpatialFeatures example, so pixel intensities are simulated and don’t necessarily match those which would have been used to set the original threshold values.

Histogram Example

from pythologist_test_images import TestImages
from pythologist_reader.formats.inform.custom import CellProjectInFormCustomMask
from pythologist import SubsetLogic as SL
cpi = TestImages().project('IrisSpatialFeatures')
cdf = cpi.cdf
cdf.db = cpi
sub = cdf.loc[cdf['frame_name']=='MEL2_7'].dropna()
cont = sub.contacts().threshold('CD8+','CD8+/contact').contacts().threshold('SOX10+','SOX10+/contact')
cont = cont.threshold('CD8+','SOX10+/contact',
                      positive_label='CD8+ contact',
                      negative_label='CD8+').\
    threshold('SOX10+','CD8+/contact',
              positive_label='SOX10+ contact',
              negative_label='SOX10+')
schema = [
    {'subset_logic':SL(phenotypes=['OTHER']),
     'edge_color':(50,50,50,255),
     'watershed_steps':0,
     'fill_color':(0,0,0,255)
    },
    {'subset_logic':SL(phenotypes=['SOX10+']),
     'edge_color':(166,206,227,255),
     'watershed_steps':0,
     'fill_color':(0,0,0,0)
    },
    {'subset_logic':SL(phenotypes=['CD8+']),
     'edge_color':(253,191,111,255),
     'watershed_steps':0,
     'fill_color':(0,0,0,0)
    },
    {'subset_logic':SL(phenotypes=['CD8+ contact']),
     'edge_color':(253,191,111,255),
     'watershed_steps':0,
     'fill_color':(255,127,0,255)
    },
    {'subset_logic':SL(phenotypes=['SOX10+ contact']),
     'edge_color':(166,206,227,255),
     'watershed_steps':0,
     'fill_color':(31,120,180,255)
    }
]
sio = cont.segmentation_images().build_segmentation_image(schema,background=(0,0,0,255))
sio.write_to_path('test_edges',overwrite=True)

MEL2_7

Visualize Contacts

Image is zoomed-in and cropped to show the contours better.

Merging the scores of two CellDataFrames is frequently necessary because current InForm exports only permit two features to be scored per export.

merged,fail = cdf1.merge_scores(cdf2,on=['sample_name','frame_name','x','y'])

cdf.scored_names

['PD1', 'PDL1']

cdf.phenotypes

['CD8+', 'OTHER', 'SOX10+']

cdf.regions

['Margin', 'Stroma', 'Tumor']

collapsed = cdf.collapse_phenotypes(['CD8+','OTHER'],'non-Tumor')
collapsed.phenotypes

['SOX10+', 'non-Tumor']
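
Conceptually, collapsing phenotypes is a relabelling of the mutually exclusive phenotype label. The following plain-Python sketch shows that idea on a list of labels; `collapse_labels` is a hypothetical helper and does not reproduce how the CellDataFrame stores its data.

```python
def collapse_labels(labels, input_labels, output_label):
    """Replace every label found in input_labels with output_label."""
    targets = set(input_labels)
    return [output_label if label in targets else label for label in labels]

labels = ['CD8+', 'OTHER', 'SOX10+', 'CD8+']
collapsed_labels = collapse_labels(labels, ['CD8+', 'OTHER'], 'non-Tumor')
remaining = sorted(set(collapsed_labels))
print(remaining)
```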

Rename TUMOR to Tumor

renamed = cdf.rename_region('TUMOR','Tumor')
renamed = cdf.rename_scored_calls({'PDL1 (Opal 520)':'PDL1'})

Make CYTOK into CYTOK PDL1+ and CYTOK PDL1-

raw_thresh = raw.threshold('CYTOK','PDL1')
CD68_CD163 = raw.threshold('CD68','CD163').\
    threshold('CD68 CD163+','PDL1').\
    threshold('CD68 CD163-','PDL1')

Generate per-frame counts and fractions of the current phenotypes and export them to a csv

cdf.counts().frame_counts().to_csv('my_frame_counts.csv')

Generate per-sample counts and fractions of the current phenotypes and export them to a csv

cdf.counts().sample_counts().to_csv('my_sample_counts.csv')
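
As a rough picture of what per-frame counts and fractions look like, the sketch below tallies phenotype labels per frame with the standard library and writes them as csv text. The column names and the helper `frame_fractions` are assumptions made for illustration; the columns produced by `frame_counts()` may differ.

```python
import csv
import io
from collections import Counter

def frame_fractions(cells):
    """cells: iterable of (frame_name, phenotype_label) pairs."""
    cells = list(cells)
    counts = Counter(cells)
    totals = Counter(frame for frame, _ in cells)
    return [
        {'frame_name': frame, 'phenotype_label': label,
         'count': n, 'fraction': n / totals[frame]}
        for (frame, label), n in sorted(counts.items())
    ]

cells = [('MEL2_7', 'CD8+'), ('MEL2_7', 'SOX10+'),
         ('MEL2_7', 'SOX10+'), ('MEL2_2', 'CD8+')]
rows = frame_fractions(cells)

# Write the rows as csv text (a file handle would work the same way)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```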

The following command creates a new CellDataFrame that has an additional binary feature representative of the contact with ‘T cell’ phenotype cells.

cdf = cdf.contacts().threshold('T cell')

The following command creates a new CellDataFrame that has an additional binary feature representative of being inside or outside 75 microns of ‘T cell’ phenotype cells.

cdf = cdf.nearestneighbors().threshold('T cell','T cell/within 75um',distance_um=75)
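
The proximity feature can be pictured as follows: find the nearest cell of the target phenotype, convert the pixel distance to microns, and compare it with the cutoff. This is a brute-force sketch under the assumption that distances are measured between cell coordinates; the function name and the cell records are hypothetical, and the library's own computation may differ.

```python
import math

def within_distance_of(cells, phenotype, distance_um, microns_per_pixel):
    """Flag each cell that lies within distance_um of another `phenotype` cell."""
    flags = []
    for i, cell in enumerate(cells):
        # Pixel distances to every *other* cell carrying the phenotype
        distances = [
            math.hypot(cell['x'] - other['x'], cell['y'] - other['y'])
            for j, other in enumerate(cells)
            if j != i and other['phenotype_label'] == phenotype
        ]
        nearest_um = min(distances) * microns_per_pixel if distances else None
        flags.append(nearest_um is not None and nearest_um <= distance_um)
    return flags

cells = [
    {'x': 0,   'y': 0, 'phenotype_label': 'T cell'},
    {'x': 100, 'y': 0, 'phenotype_label': 'OTHER'},  # 49.6 um from the T cell
    {'x': 200, 'y': 0, 'phenotype_label': 'OTHER'},  # 99.2 um from the T cell
]
flags = within_distance_of(cells, 'T cell', distance_um=75, microns_per_pixel=0.496)
```

The lone T cell has no other T cell nearby, so it is flagged negative in this sketch.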

Check outputs against IrisSpatialFeatures outputs

To ensure we are generating expected outputs we can check against the outputs of IrisSpatialFeatures [github].

Modules

class pythologist.CellDataFrame(*args, **kw)[source]

The CellDataFrame class is an extension of a pandas.DataFrame with per-cell rows that have region, binary calls, mutually exclusive phenotypes, cell locations, and cell-cell contact.

Params:

microns_per_pixel (float): conversion factor that gets saved along with the dataframe once it is set (a 20x Vectra is 0.496)

db (CellProject): a storage class that has all the image and mask data

cartesian(subsets=None, step_pixels=100, max_distance_pixels=150, *args, **kwargs)[source]

Return a class that can be used to create honeycomb plots

Parameters
  • subsets (list) – list of SubsetLogic objects

  • step_pixels (int) – distance between hexagons

  • max_distance_pixels (int) – the distance from each point within which to calculate the quantity of the phenotype for that area

Returns

returns a class that holds the layout of the points to plot.

Return type

Cartesian
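
The roles of `step_pixels` and `max_distance_pixels` can be sketched with a small honeycomb layout: points are spaced `step` pixels apart in staggered rows, and cells are counted within a radius of each point. This is an assumed geometry written for illustration with hypothetical helper names; it is not taken from the Cartesian class.

```python
import math

def hex_grid(width, height, step):
    """Centers of a honeycomb layout covering a width x height frame."""
    row_height = step * math.sqrt(3) / 2
    points = []
    row = 0
    y = 0.0
    while y <= height:
        # Every other row is shifted by half a step
        x = step / 2 if row % 2 else 0.0
        while x <= width:
            points.append((x, y))
            x += step
        y += row_height
        row += 1
    return points

def count_within(points, cells, max_distance):
    """Number of cells within max_distance of each grid point."""
    return [
        sum(1 for cx, cy in cells if math.hypot(cx - px, cy - py) <= max_distance)
        for px, py in points
    ]

points = hex_grid(100, 100, step=50)
cells = [(0, 0), (10, 0), (90, 90)]
counts = count_within(points, cells, max_distance=20)
```

A larger `max_distance` relative to `step` makes neighbouring points share cells, which smooths the resulting plot.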

collapse_phenotypes(input_phenotype_labels, output_phenotype_label, verbose=True)[source]

Rename one or more input phenotypes to a single output phenotype

Parameters
  • input_phenotype_labels (list) – A str name or list of names to combine

  • output_phenotype_label (str) – A str name to change the phenotype names to

  • verbose (bool) – output more details

Returns

The CellDataFrame modified.

Return type

CellDataFrame

combine_regions(input_region_labels, output_region_label, verbose=True)[source]

Combine/rename one or more input regions to a single output region

Parameters
  • input_region_labels (list) – A str name or list of names to combine

  • output_region_label (str) – A str name to change the region names to

  • verbose (bool) – output more details

Returns

The CellDataFrame modified.

Return type

CellDataFrame

classmethod concat(array_like)[source]

Concatenate multiple CellDataFrames

throws an error if the microns_per_pixel is not uniform across the frames

Parameters

array_like (list) – a list of CellDataFrames with 1 or more CellDataFrames

Returns

CellDataFrame

contacts(*args, **kwargs)[source]

Assess the cell-to-cell contacts recorded in the CellDataFrame

Returns

returns a class that holds cell-to-cell contact information for whatever phenotypes were in the CellDataFrame before execution.

Return type

Contacts

convert_cascading_scores_to_mutually_exclusive_ordinal_binary(cascading_scored_calls, ordinal_labels)[source]

If you have a cascade of scoring stored as binary calls, you can convert these to mutually exclusive binary calls for ordinal labels.

For example, if you have thresholds for 0/1, 1/2, and 2/3, you can convert these thresholds to mutually exclusive +/- for 0,1,2,3

Parameters
  • cascading_scored_calls (list) – a list of thresholds in scored_names, ordered from the lowest threshold to the greatest

  • ordinal_labels (list) – the list of ordinal labels to split the phenotype label into

Returns

CellDataFrame
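
The conversion can be sketched for a single cell as counting how many cascading thresholds it passes and marking only that ordinal label as positive. The sketch assumes the cascade is consistent (a cell above a higher threshold is also above every lower one) and uses a hypothetical function name; it is not the library's implementation.

```python
def cascading_to_ordinal(scored_calls, cascading_scored_calls, ordinal_labels):
    """Turn ordered binary threshold calls into mutually exclusive ordinal calls."""
    if len(ordinal_labels) != len(cascading_scored_calls) + 1:
        raise ValueError("expected one more ordinal label than thresholds")
    passed = 0
    for name in cascading_scored_calls:
        if not scored_calls[name]:
            break  # assumption: no higher threshold can be passed after a failure
        passed += 1
    return {label: int(i == passed) for i, label in enumerate(ordinal_labels)}

calls = {'0/1': 1, '1/2': 1, '2/3': 0}
ordinal = cascading_to_ordinal(calls, ['0/1', '1/2', '2/3'], ['0', '1', '2', '3'])
```

Here the cell passes the first two thresholds, so only the label '2' is positive.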

counts(*args, **kwargs)[source]

Return a class that can be used to access count densities

Parameters
  • measured_regions (pandas.DataFrame) – Dataframe of regions that are being measured (defaults to all the regions)

  • measured_phenotypes (list) – List of phenotypes present (defaults to all the phenotypes)

  • minimum_region_size_pixels (int) – Minimum region size to calculate counts on in pixels (Default: 1)

Returns

returns a class that holds the counts.

Return type

Counts
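
A count density relates a cell count to the physical area of a region. The sketch below assumes density is reported as cells per square millimetre and that regions below the minimum size are skipped; both are assumptions, and `density_per_mm2` is a hypothetical helper rather than a method of the Counts class.

```python
def density_per_mm2(cell_count, region_size_pixels, microns_per_pixel,
                    minimum_region_size_pixels=1):
    """Cells per mm^2, or None when the region is too small to measure."""
    if region_size_pixels < minimum_region_size_pixels:
        return None
    area_um2 = region_size_pixels * microns_per_pixel ** 2
    area_mm2 = area_um2 / 1_000_000
    return cell_count / area_mm2

# 50 cells in a 1,000,000 pixel region at 0.5 microns per pixel (0.25 mm^2)
density = density_per_mm2(50, 1_000_000, 0.5)
empty = density_per_mm2(5, 0, 0.5)
```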

property db

Read from or assign to this property to get or set the CellProject storage object

drop_scored_calls(names)[source]

Take a name or list of scored call names and drop those from the scored calls

Parameters

names (list) – list of names to drop or a single string name to drop

Returns

The CellDataFrame modified.

Return type

CellDataFrame

fill_phenotype_calls(phenotypes=None, inplace=False)[source]

Set the phenotype_calls according to the phenotype names

fill_phenotype_label(inplace=False)[source]

Set the phenotype_label column according to our rules for mutual exclusion

property frame_columns

Returns a list of fields suitable for identifying the unique image frames

get_measured_regions()[source]

Returns

Output a dataframe with regions and region sizes

Return type

pandas.DataFrame

get_valid_cell_indecies()[source]

Return a dataframe of images present with ‘valid’ being a list of cell indices that can be included

is_uniform(verbose=True)[source]

Check to make sure phenotype calls, or scored calls are consistent across all images / samples

merge_scores(df_addition, reference_markers='all', addition_markers='all', on=['project_name', 'sample_name', 'frame_name', 'cell_index'])[source]

Combine CellDataFrames that differ by score composition

Parameters
  • df_addition (CellDataFrame) – The CellDataFrame to merge scores in from

  • reference_markers (list) – which scored call names to keep in this object (default: all)

  • addition_markers (list) – which scored call names to merge in (default: all)

  • on (list) – the features to merge cells on

Returns

returns a passing CellDataFrame where merge criteria were met and a fail CellDataFrame where merge criteria were not met.

Return type

CellDataFrame,CellDataFrame
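
The merge can be thought of as matching cells on the `on` fields, combining their scored calls when a match exists, and setting aside the cells that have no match. The following sketch works on plain dictionaries with a hypothetical function name; it illustrates the pass/fail split only and is not the library's implementation.

```python
def merge_scores_sketch(reference, addition, on):
    """Return (merged, failed) lists of cells after matching on the `on` fields."""
    index = {tuple(cell[k] for k in on): cell for cell in addition}
    merged, failed = [], []
    for cell in reference:
        match = index.get(tuple(cell[k] for k in on))
        if match is None:
            failed.append(cell)
            continue
        combined = dict(cell)
        combined['scored_calls'] = {**cell['scored_calls'], **match['scored_calls']}
        merged.append(combined)
    return merged, failed

on = ['sample_name', 'frame_name', 'x', 'y']
reference = [
    {'sample_name': 'MEL2', 'frame_name': 'MEL2_7', 'x': 10, 'y': 20, 'scored_calls': {'PD1': 1}},
    {'sample_name': 'MEL2', 'frame_name': 'MEL2_7', 'x': 30, 'y': 40, 'scored_calls': {'PD1': 0}},
]
addition = [
    {'sample_name': 'MEL2', 'frame_name': 'MEL2_7', 'x': 10, 'y': 20, 'scored_calls': {'PDL1': 0}},
]
merged, failed = merge_scores_sketch(reference, addition, on)
```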

property microns_per_pixel

Read or store the microns per pixel (float) value by reading or assigning to this

nearestneighbors(*args, **kwargs)[source]

Access nearest neighbor and proximity information for the cells

Parameters
  • verbose (bool) – output more details if true

  • measured_regions (pandas.DataFrame) – explicitly list the measured images and regions

  • measured_phenotypes (list) – explicitly list the phenotypes present

Returns

returns a class that holds nearest neighbor information for whatever phenotypes were in the CellDataFrame before execution. This class is suitable for nearest neighbor and proximity operations.

Return type

NearestNeighbors

permute_phenotype_labels(phenotype_labels=None, random_state=None, group_strategy=['project_name', 'project_id', 'sample_name', 'sample_id', 'frame_name', 'frame_id'])[source]

Shuffle phenotype labels. Defaults to shuffling all labels within a frame. Adjust this by modifying group_strategy.

Parameters
  • phenotype_labels (list) – a list of phenotype_labels to shuffle among each other; if None, shuffle all

  • random_state (int or numpy random state) – pass to the pandas shuffle function

  • group_strategy (list) – variables to group by

Returns

CellDataFrame
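
A label permutation keeps every cell where it is and only reassigns which label each cell carries, separately within each group. The sketch below groups by a single field and uses Python's `random` module; the function name and the single-field grouping are simplifications chosen for illustration.

```python
import random
from collections import defaultdict

def permute_labels(cells, group_key='frame_name', random_state=None):
    """Shuffle phenotype labels among the cells of each group."""
    rng = random.Random(random_state)
    groups = defaultdict(list)
    for i, cell in enumerate(cells):
        groups[cell[group_key]].append(i)
    shuffled = [dict(cell) for cell in cells]  # copies; the input is left untouched
    for indices in groups.values():
        labels = [cells[i]['phenotype_label'] for i in indices]
        rng.shuffle(labels)
        for i, label in zip(indices, labels):
            shuffled[i]['phenotype_label'] = label
    return shuffled

cells = [{'frame_name': 'A', 'phenotype_label': label, 'x': i}
         for i, label in enumerate(['CD8+', 'SOX10+', 'OTHER', 'SOX10+'])]
cells.append({'frame_name': 'B', 'phenotype_label': 'CD8+', 'x': 9})
shuffled = permute_labels(cells, random_state=0)
```

The number of cells per label in each frame is unchanged, which is what makes the shuffled data usable as a null comparison.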

property phenotypes

Return the list of phenotypes present

phenotypes_to_regions(*args, **kwargs)[source]

Create a new Project where regions are replaced to be based on regions defined as phenotypes

Parameters
  • path (str) – Location to store a new hdf5 file containing a database update with new region images

  • gaussian_sigma (float) – the sigma parameter to the gaussian_filter function that says how much to ‘blur’

  • overwrite (bool) – if True allows you to overwrite the path default (False)

  • unset_label (str) – A label to give regions that are unaccounted for

  • project_name (str) – the project name

Returns

The new cell project CellDataFrame: The updated cell project

Return type

CellProject

phenotypes_to_scored(phenotypes=None, overwrite=False)[source]

Add mutually exclusive phenotypes to the scored calls

Parameters
  • phenotypes (list) – a list of phenotypes to add to scored calls. if none or not set, add them all

  • overwrite (bool) – if True allow the overwrite of a phenotype, if False, the phenotype must not exist in the scored calls

Returns

CellDataFrame

property project_columns

Returns a list of fields suitable for identifying the unique projects

prune_neighbors()[source]

If the CellDataFrame has been subsetted, some of the cell-cell contacts may no longer be part of the dataset. This prunes those no-longer existent connections.

Returns

A CellDataFrame with only valid cell-cell contacts

Return type

CellDataFrame

qc(*args, **kwargs)[source]

Return a class that can be used to access QC reports

Returns

returns a class that can be used to interrogate the QC.

Return type

QC

classmethod read_hdf(path, key=None)[source]

Read a CellDataFrame from an hdf5 file.

Parameters
  • path (str) – the path to read from

  • key (str) – the name of the location to read from

Returns

CellDataFrame

property regions

Return the list of region names

regions_to_scored(regions=[])[source]

Convert the region calls to scored_calls

Parameters

regions (list) – a list of regions to use (default empty list will use all regions)

rename_phenotype(*args, **kwargs)[source]

simple alias for collapse phenotypes

rename_region(*args, **kwargs)[source]

simple alias for combine regions

rename_scored_calls(change)[source]

Change the names of scored call names, input dictionary change with {<current name>:<new name>} format, new name must not already exist

Parameters

change (dict) – a dictionary of current name keys and new name values

Returns

The CellDataFrame modified.

Return type

CellDataFrame

property sample_columns

Returns a list of fields suitable for identifying the unique samples

property scored_names

Return the list of binary feature names

scored_to_phenotype(phenotypes)[source]

Convert binary phenotypes to mutually exclusive phenotypes. If none of the phenotypes are set, then phenotype_label becomes nan. If any of the phenotypes are multiply set, then it throws a fatal error.

Parameters

phenotypes (list) – a list of scored_names to convert to phenotypes

Returns

CellDataFrame
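
The rule described above can be sketched for one cell as follows. The sketch returns `None` where the documentation says nan and raises `ValueError` where it says a fatal error is thrown; both choices, and the function name, are assumptions made for illustration.

```python
def scored_to_phenotype_label(scored_calls, phenotypes):
    """Pick the single positive scored name as the phenotype label."""
    positive = [name for name in phenotypes if scored_calls.get(name)]
    if len(positive) > 1:
        raise ValueError(f"cell is positive for multiple phenotypes: {positive}")
    return positive[0] if positive else None

label = scored_to_phenotype_label({'CD8+': 1, 'SOX10+': 0}, ['CD8+', 'SOX10+'])
unset = scored_to_phenotype_label({'CD8+': 0, 'SOX10+': 0}, ['CD8+', 'SOX10+'])

try:
    scored_to_phenotype_label({'CD8+': 1, 'SOX10+': 1}, ['CD8+', 'SOX10+'])
    conflict_detected = False
except ValueError:
    conflict_detected = True
```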

segmentation_images(*args, **kwargs)[source]

Use the segmented images to create per-image graphics

Parameters

verbose (bool) – output more details if true

Returns

returns a class used to construct the image graphics

Return type

SegmentationImages

serialize()[source]

Convert the data to one that can be saved in h5 structures

Returns

like a CellDataFrame but with serialized columns

Return type

pandas.DataFrame

subset(logic, update=False)[source]

Create a specific phenotype based on a logic, where logic is a ‘SubsetLogic’ object. Take the union of all the phenotypes listed; if none are listed, use all phenotypes. Take the intersection of all the scored calls.

Parameters
  • logic (SubsetLogic) – A subsetlogic object to slice on

  • update (bool) – (default False) change the name of the phenotype according to the label in the subset logic

Returns

The CellDataFrame modified.

Return type

CellDataFrame
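
The selection rule (union over the listed phenotypes, intersection over the scored calls) can be sketched as a per-cell predicate. The '+' and '-' values mirror the `scored_calls={'PD1':'+'}` form used with SubsetLogic elsewhere in this documentation; the function name and the dictionary layout of a cell are hypothetical.

```python
def matches(cell, phenotypes=(), scored_calls=None):
    """True if the cell satisfies the phenotype union and scored-call intersection."""
    scored_calls = scored_calls or {}
    wanted = {'+': 1, '-': 0}
    # Union: any listed phenotype is accepted; an empty list accepts all
    in_union = not phenotypes or cell['phenotype_label'] in phenotypes
    # Intersection: every requested scored call must agree
    in_intersection = all(
        cell['scored_calls'].get(name) == wanted[sign]
        for name, sign in scored_calls.items()
    )
    return in_union and in_intersection

cells = [
    {'phenotype_label': 'CD8+',   'scored_calls': {'PD1': 1, 'PDL1': 0}},
    {'phenotype_label': 'CD8+',   'scored_calls': {'PD1': 0, 'PDL1': 0}},
    {'phenotype_label': 'SOX10+', 'scored_calls': {'PD1': 1, 'PDL1': 1}},
]
selected = [c for c in cells if matches(c, ['CD8+'], {'PD1': '+'})]
pd1_positive = [c for c in cells if matches(c, scored_calls={'PD1': '+'})]
```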

threshold(phenotype, scored_name, positive_label=None, negative_label=None)[source]

Split a phenotype on a scored_call. If no label is specified, the format ‘<phenotype> <scored_call><+/->’ is used; to specify labels, give the positive and negative label.

Parameters
  • phenotype (str) – name of the phenotype to threshold

  • scored_name (str) – scored call name to apply value from

  • positive_label (str) – name to apply for positive label (default: <phenotype> <scored_call>+)

  • negative_label (str) – name to apply for negative label (default: <phenotype> <scored_call>-)

Returns

The CellDataFrame modified.

Return type

CellDataFrame
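
The default labelling can be illustrated with a small sketch that splits one phenotype on a binary scored call. It operates on plain dictionaries and uses a hypothetical function name; only the label format ‘<phenotype> <scored_call><+/->’ is taken from the description above.

```python
def threshold_sketch(cells, phenotype, scored_name,
                     positive_label=None, negative_label=None):
    """Split `phenotype` into positive and negative labels on `scored_name`."""
    positive = positive_label or f"{phenotype} {scored_name}+"
    negative = negative_label or f"{phenotype} {scored_name}-"
    result = []
    for cell in cells:
        cell = dict(cell)  # leave the input untouched
        if cell['phenotype_label'] == phenotype:
            is_positive = bool(cell['scored_calls'][scored_name])
            cell['phenotype_label'] = positive if is_positive else negative
        result.append(cell)
    return result

cells = [
    {'phenotype_label': 'CYTOK', 'scored_calls': {'PDL1': 1}},
    {'phenotype_label': 'CYTOK', 'scored_calls': {'PDL1': 0}},
    {'phenotype_label': 'CD68',  'scored_calls': {'PDL1': 1}},
]
split_labels = [c['phenotype_label'] for c in threshold_sketch(cells, 'CYTOK', 'PDL1')]
```

Cells of other phenotypes keep their labels, which is why thresholds can be chained one phenotype at a time.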

threshold_on_mutually_exclusive_ordinal_labels(phenotype_label, ordinal_labels)[source]

If mutually exclusive ordinal labels are present among the scoring, you can threshold a phenotype on these labels.

Parameters
  • phenotype_label (str) – a phenotype_label split based on the ordinal labels

  • ordinal_labels (list) – the list of ordinal labels to split the phenotype label on

Returns

CellDataFrame

to_hdf(path, key, mode='a')[source]

Save the CellDataFrame to an hdf5 file.

Parameters
  • path (str) – the path to save to

  • key (str) – the name of the location to save it to

  • mode (str) – write mode

zero_fill_missing_phenotypes()[source]

Fill in missing phenotypes and scored types by listing any missing data as negative

Returns

The CellDataFrame modified.

Return type

CellDataFrame

zero_fill_missing_scores()[source]

Fill in missing phenotypes and scored types by listing any missing data as negative

Returns

The CellDataFrame modified.

Return type

CellDataFrame

class pythologist.interface.SegmentationImages(*args, **kwargs)[source]

Class suitable for generating image outputs

build_segmentation_image(schema, background=(0, 0, 0, 0))[source]

Put together an image. Defined by a list of layers with RGBA colors

Make the schema example

schema = [
    {'subset_logic':SL(phenotypes=['SOX10+']),
     'edge_color':(31, 31, 46,255),
     'watershed_steps':0,
     'fill_color':(51, 51, 77,255)
    },
    {'subset_logic':SL(phenotypes=['CD8+'],scored_calls={'PD1':'+'}),
     'edge_color':(255,0,0,255),
     'watershed_steps':1,
     'fill_color':(0,0,0,255)
    },
    {'subset_logic':SL(phenotypes=['CD8+'],scored_calls={'PD1':'-'}),
     'edge_color':(255,0,255,255),
     'watershed_steps':1,
     'fill_color':(0,0,255,255)
    }
]
imgs = imageaccess.build_segmentation_image(schema,background=(0,0,0,255))

Parameters
  • schema (list) – a list of layers (see example above)

  • background (tuple) – an RGBA 0-255 color tuple for the background color

Returns

an output suitable for writing images

Return type

SegmentationImageOutput

class pythologist.interface.SegmentationImageOutput(*args, **kw)[source]

The Segmentation Image Output class

write_to_path(path, suffix='', format='png', overwrite=False)[source]

Output the data in the dataframe’s ‘image’ column to a directory structured by project->sample and named by frame

Parameters
  • path (str) – Where to write the directory of images

  • suffix (str) – for labeling the images you write

  • format (str) – default ‘png’ format to write the file

  • overwrite (bool) – default False. if true can overwrite files in the path

Modifies:

Creates path folder if necessary and writes images to path

class pythologist.measurements.spatial.nearestneighbors.NearestNeighbors(*args, **kw)[source]
