diff --git a/docs/index.rst b/docs/index.rst index aa49208a..69e1f240 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -93,6 +93,7 @@ If you have a question about the use of emcee, please post it to `the users list tutorials/autocorr tutorials/monitor tutorials/moves + tutorials/swmr License & Attribution diff --git a/docs/tutorials/swmr.ipynb b/docs/tutorials/swmr.ipynb new file mode 100644 index 00000000..827e8752 --- /dev/null +++ b/docs/tutorials/swmr.ipynb @@ -0,0 +1,480 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "(swmr)=\n", + "\n", + "# Reading an HDF5 file while it is being written\n", + "\n", + "This tutorial will show how it is possible to read from an HDF5 file while it is being written by the `HDFBackend`, without crashing the process that is currently writing to the file.\n", + "\n", + "By default, an HDF5 file can be either read or written, but not both at the same time. Trying to do both anyway may crash the `emcee` process, because the information in the file is not properly synchronized.\n", + "\n", + "**Important note**: the relevance of this effect depends on how much time is spent on writing relative to the time taken to perform a step. If saving takes relatively little time, you are less likely to encounter crashes of the emcee process."
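The exclusive-access behaviour described above comes from HDF5's file locking. As a rough standard-library illustration of the failure mode (this uses POSIX advisory locks as an analogy, not the actual HDF5 code path, and the file name is made up for the example):

```python
import fcntl
import os
import tempfile

# A scratch file standing in for the HDF5 backend file.
path = os.path.join(tempfile.mkdtemp(), "backend.lock")
open(path, "w").close()

writer_fd = os.open(path, os.O_RDWR)
reader_fd = os.open(path, os.O_RDWR)

# The "writer" takes an exclusive lock, much as the HDF5 library does.
fcntl.flock(writer_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)

# A second exclusive lock attempt fails immediately, mirroring the
# "unable to lock file, errno = 11" error that h5py surfaces.
try:
    fcntl.flock(reader_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    got_lock = True
except BlockingIOError:
    got_lock = False

print(got_lock)  # False: the file is already locked
```

Because the locks are non-blocking, the second attempt fails instead of waiting, which is the same "Resource temporarily unavailable" behaviour we will see from h5py below.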
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "%config InlineBackend.figure_format = \"retina\"\n", + "\n", + "from matplotlib import rcParams\n", + "rcParams[\"savefig.dpi\"] = 100\n", + "rcParams[\"figure.dpi\"] = 100\n", + "rcParams[\"font.size\"] = 20" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import multiprocessing as mp\n", + "import os\n", + "import time\n", + "import numpy as np\n", + "import emcee" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Example of failing to write and read at the same time" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let us consider the following example.\n", + "\n", + "We want to perform an MCMC on some process: in a script, we define the log-probability function and set up emcee in a function." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "def lnprob(x):\n", + " time.sleep(0.0001)\n", + " return 0.\n", + "\n", + "def writer():\n", + " nwalkers = 100\n", + " nsteps = 1000\n", + " if os.path.isfile('backend.h5'):\n", + " os.remove('backend.h5')\n", + " backend = emcee.backends.HDFBackend('backend.h5')\n", + " backend.reset(nwalkers,1)\n", + " sampler = emcee.EnsembleSampler(nwalkers,1,lnprob,backend=backend)\n", + " pos0 = np.ones(nwalkers) + ((np.random.random(nwalkers)-0.5)*2e-3)\n", + " print(pos0.shape)\n", + " sampler.run_mcmc(pos0[:, None],nsteps,progress=True,store=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We also have a script to read the chain from the HDF5 file."
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "def reader():\n", + " backend = emcee.backends.HDFBackend('backend.h5',read_only=True)\n", + " chain = backend.get_chain()\n", + " print(chain.shape)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, we start the chain in the background, with the help of the `multiprocessing` module." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "(100,)\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + " 16%|██████████▋ | 164/1000 [00:03<00:18, 44.11it/s]\n", + "Process Process-2:\n", + "Traceback (most recent call last):\n", + " File \"/home/ale/miniconda3/envs/emceefork/lib/python3.9/multiprocessing/process.py\", line 315, in _bootstrap\n", + " self.run()\n", + " File \"/home/ale/miniconda3/envs/emceefork/lib/python3.9/multiprocessing/process.py\", line 108, in run\n", + " self._target(*self._args, **self._kwargs)\n", + " File \"/tmp/ipykernel_17208/3509143556.py\", line 15, in writer\n", + " sampler.run_mcmc(pos0[:, None],nsteps,progress=True,store=True)\n", + " File \"/home/ale/Documenti/DOTTORATO/Progetti/emcee/src/emcee/ensemble.py\", line 438, in run_mcmc\n", + " for results in self.sample(initial_state, iterations=nsteps, **kwargs):\n", + " File \"/home/ale/Documenti/DOTTORATO/Progetti/emcee/src/emcee/ensemble.py\", line 405, in sample\n", + " self.backend.save_step(state, accepted)\n", + " File \"/home/ale/Documenti/DOTTORATO/Progetti/emcee/src/emcee/backends/hdf.py\", line 292, in save_step\n", + " with self.open(\"a\") as f:\n", + " File \"/home/ale/Documenti/DOTTORATO/Progetti/emcee/src/emcee/backends/hdf.py\", line 116, in open\n", + " f = h5py.File(self.filename, mode, libver=libver, swmr=swmr)\n", + " File 
\"/home/ale/miniconda3/envs/emceefork/lib/python3.9/site-packages/h5py/_hl/files.py\", line 444, in __init__\n", + " fid = make_fid(name, mode, userblock_size,\n", + " File \"/home/ale/miniconda3/envs/emceefork/lib/python3.9/site-packages/h5py/_hl/files.py\", line 211, in make_fid\n", + " fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)\n", + " File \"h5py/_objects.pyx\", line 54, in h5py._objects.with_phil.wrapper\n", + " File \"h5py/_objects.pyx\", line 55, in h5py._objects.with_phil.wrapper\n", + " File \"h5py/h5f.pyx\", line 100, in h5py.h5f.open\n", + "BlockingIOError: [Errno 11] Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')\n" + ] + } + ], + "source": [ + "writer_proc = mp.Process(target=writer)\n", + "writer_proc.start()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If for any reason you want to stop the reader, please uncomment the following cell and run it." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "#writer_proc.terminate()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "While the chain is running, we want to check on the status of the chain, se we read it." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "(50, 100, 1)\n" + ] + } + ], + "source": [ + "reader()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In case the function call above worked without causing problems, let us try repeating the read several times in succession. If instead the reader already crashed, then running the cell below is not needed and you should read further down." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "(77, 100, 1)\n", + "Could not read from the backend.\n", + "Could not read from the backend.\n", + "(144, 100, 1)\n", + "(164, 100, 1)\n", + "The writer crashed after 5 tries.\n" + ] + } + ], + "source": [ + "imax=50\n", + "i=0\n", + "while writer_proc.is_alive():\n", + " i+=1\n", + " try:\n", + " reader()\n", + " except:\n", + " print(\"Could not read from the backend.\")\n", + " if i>=imax:\n", + " break\n", + " time.sleep(0.5)\n", + "if i<imax:\n", + " print(\"The writer crashed after %d tries.\" % i)"