Thursday, June 9, 2016

IPython parallel: local vs. engine execution

TL;DR: wrapping the function in a lambda when calling map on an ipyparallel View avoids having to import the function locally.

I've always used %%px --local to do parallel processing in Python. But recently I wanted to put all my code in a Python file and have a short notebook that essentially just kicked off the processes and wrote the results to disk. So I tried this:
#In [1]:
from ipyparallel import Client
IP_client = Client()
IP_view = IP_client.load_balanced_view()

#In [2]:
%%px
import sys
sys.path.append('.../code/')
from myresearch import analyze_multiple_ciks

#In [3]:
N = len(IP_client.ids) # or larger for load balancing
# split df (defined earlier) into N chunks by cik quantile
_gs = [df[(df.cik > (df.cik.quantile(i/N) if i else 0))
          & (df.cik <= df.cik.quantile((i+1)/N))]
       for i in range(N)]

#In [4]:
res = IP_view.map(analyze_multiple_ciks, _gs)
However, this doesn't work. The problem is IP_view.map: it looks up analyze_multiple_ciks in the local (client) namespace so it can ship the function to the engines, but the %%px import in In[2] only ran on the engines, so the name was never loaded locally. Wrapping the function in a lambda defers the name lookup until the engines actually call it, which seems to work:
#In [4]:
res = IP_view.map(lambda x: analyze_multiple_ciks(x), _gs)
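The lambda works because its body is only evaluated when it is called, so the name analyze_multiple_ciks is resolved at call time on the engines, where the %%px import has already run. A minimal pure-Python sketch of that late binding (illustrative names only, no ipyparallel involved):

```python
# Creating the lambda does NOT look up the name in its body.
f = lambda x: analyze(x)  # 'analyze' is undefined here, yet no error

try:
    f(1)  # calling it triggers the lookup, which fails at this point
except NameError:
    deferred = True

# Define the function after the lambda was created (analogous to the
# %%px import having run on the engines before the lambda executes there).
def analyze(x):
    return x * 2

result = f(21)  # the lookup now succeeds
```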
Perhaps this was obvious, but I couldn't find much online about it. Also I do the chunking manually in In[3] because I've found using ipython to queue 23,000 tasks is really slow. So I wrap my code in an 'analyze_multiple' function and reduce the queue length considerably. Maybe that's not still a problem in the updated ipyparallel, but it's how I've always done it.