[1]:
%pylab inline
Populating the interactive namespace from numpy and matplotlib
[2]:
import os
import pyhf
import pyhf.readxml
from ipywidgets import interact, fixed

Binned HEP Statistical Analysis in Python

HistFactory

HistFactory is a popular framework to analyze binned event data and commonly used in High Energy Physics. At its core it is a template for building a statistical model from individual binned distribution (‘Histograms’) and variations on them (‘Systematics’) that represent auxiliary measurements (for example an energy scale of the detector which affects the shape of a distribution)

pyhf

pyhf is a work-in-progress standalone implementation of the HistFactory p.d.f. template and an implementation of the test statistics and asymptotic formulae described in the paper by Cowan, Cranmer, Gross, Vitells: Asymptotic formulae for likelihood-based tests of new physics [arxiv:1007.1727].

Models can be defined using JSON specification, but existing models based on the XML + ROOT file scheme are readable as well.

The Demo

The input data for the statistical analysis was built generated using the containerized workflow engine yadage (see demo from KubeCon 2018 [youtube]). Similarly to Binder this utilizes modern container technology for reproducible science. Below you see the execution graph leading up to the model input data at the bottom.

[3]:
import base64
from IPython.core.display import display, HTML
anim = base64.b64encode(open('workflow.gif','rb').read()).decode('ascii')
HTML('<img src="data:image/gif;base64,{}">'.format(anim))
[3]: