Tuesday, February 28, 2012

Remove Overlapping Windows

A little bit of SAS code to remove dates that overlap by some amount (defined in a Macro Variable). Example from an event study with dates from SDC.

First off sort the data, removing any duplicates. In this case duplicates is determined just by the variables sorted on, so permco, permno, and date. This removes the problem of having more than one event on the same day.

Create a temporary table with lag and difference fields. The lag field is more for debugging as the difference field is all we care about. If the event is the first event for the company, leave the fields blank.

Create the output table and only keep events where the event date is either blank (first event) or if the time between events is greater than the window.

------ Code ------

mydata: the library where my crsp and cleaned sdc files are
sdc_permno: database with SDC data and permno's instead of cusips (already linked SDC and CRSP)
sdc_final: the output sdc data file with the event dates (non overlapping) and permnos
&beta_pre_length: The macro variable with the length of my pre window for calculating the betas in the market return model
&study_start_date: The macro bariable with the start date

/* SDC_FINAL: Clean up data and remove all SEOs with another SEO within window_prev before. */
PROC SORT DATA=mydata.sdc_permno OUT=tmp_sdc_permno NODUPKEY;
    BY permco permno fdate;
RUN;
DATA tmp_sdc_lag;
    SET tmp_sdc_permno;
    BY permco permno fdate idate shares shares_p shares_s;
    lag_date = LAG(fdate);
    dif_date = DIF(fdate);
    IF FIRST.permco THEN DO;
        lag_date = .;
        dif_date = .;
    END;
RUN;
PROC SQL;
    CREATE TABLE mydata.sdc_final AS
    SELECT monotonic() as count, permno, fdate, shares, shares_p, shares_s
    FROM tmp_sdc_lag
    WHERE fdate > &study_start_date and (dif_date EQ . OR dif_date > &beta_pre_len)
    ;
    DROP TABLE tmp_sdc_lag;
    DROP TABLE tmp_sdc_permno;
QUIT;

No comments:

Post a Comment