I use statsmodels instead of Stata where possible, and wanted to cluster standard errors by firm. The problem I ran into: I use Patsy to create the endog/exog matrices, and Patsy drops rows with missing data, while statsmodels requires the cluster group Series to have the same length as those matrices. (Aside: there's an open GitHub issue about this.) I'm sure there are more clever solutions, but mine was to give Patsy a dataframe with no missing data. The statsmodels documentation was a bit unclear, so I figured I'd share the working snippet below.
import numpy as np
import statsmodels.api as sm

# Selection criteria (df is the source dataframe)
select_df = (df[(df['at'] > 1) & (df['ff12'] != 8)]
             .sort_values('cik y_q'.split()))
# Columns that appear in regressions, as well as group variable
cols = 'cik cp ni_at re_at xrd_at at y_q ff12'.split()
# Final dataframe with no missing data.
# This gets the patsy arrays and group series to have the same length.
reg_df = select_df.loc[select_df[cols].notnull().all(axis=1), cols]
mod = sm.OLS.from_formula('cp ~ ni_at + re_at + xrd_at + np.log(at)'
                          ' + C(y_q) + C(ff12)', reg_df)
res = mod.fit(cov_type='cluster', cov_kwds={'groups': reg_df['cik']})
# Output results without the fixed-effect dummy rows
print("\n".join([x for x in str(res.summary()).split('\n')
                 if 'C(' not in x]))