?? sqlfun09.sas
字號:
/****************************************************************/ /* S A S S A M P L E L I B R A R Y */ /* */ /* NAME: SQLFUN09 */ /* TITLE: fun/interesting applications of PROC SQL. (fun09) */ /* PRODUCT: BASE */ /* SYSTEM: ALL */ /* KEYS: SQL DATMAN SELECT WHERE GROUP ORDER COUNT MISSING */ /* TRANSPOSE PRINT DISTINCT */ /* PROCS: SQL */ /* DATA: */ /* */ /* SUPPORT: pmk UPDATE: */ /* REF: */ /* MISC: this example was contributed by Howard Schreier */ /* of the US Dept. of Commerce, via BITNET */ /* */ /* you can contribute your interesting samples. */ /* send internet email to KENT@UNX.SAS.COM or */ /* USmail to SAS Institute. */ /* */ /****************************************************************/ title1 'SAS SAMPLE LIBRARY, MEMBER(SQLFUN09)'; /* * A User asks: * * I have a survey on which firms listed their five most * important suppliers and their suppliers' locations. Now I * want to find out what percent of firms in the survey get * supplies from different places (e.g., what percent * get supplies from N.Y., from Boston, etc.). * * The five suppliers are separate variables. If I collapse * them by creating a new file with a separate OUTPUT statement * for each of the five variables, I get up to five times as many * firms as I really have (not all respondents listed as many as * five firms), so if a place appears only once, its relative * frequency should be 1/N where N is the actual number of firms, * but its computed relative frequency will be 1/M where M is the * number of supplier locations reported. * * On the other hand, if I do not create a separate file, the * five original variables are not compatible. * Firm 1 may have NY as the first variable, Firm 2 may have NY * as its third in the list, Firm 3 may list five different * cities, and Firm 4 may list five completely different cities. * * Does anybody have a suggestion for a quick and easy way to * compute what percent of all firms get their supplies from * the various cities? * */ /* * There are undoubtedly a number of different ways to do this, * but I think that transposing the data so that there is just * one city variable but multiple observations for each firm * will make it easier in any case. * * Here is a test data set, created in your configuration, then * rearranged by PROC TRANSPOSE: * */ data survey; infile cards; input company $6. +1 (city1-city5) ($8.); cards;First Chicago London ParisSecond Paris Tokyo Sydney Cairo TokyoThird Caracas Manila TokyoFourth Boston Prague Tokyo London ; proc transpose data=survey out=for_sql(rename=(col1=city)); by company notsorted; var city1-city5; run; proc print; title2 'Original survey data'; run; /* * The key to this query is the embedded subquery (the long * expression in parentheses). It creates a one-row temporary * table containing the number of companies, which is then * joined to the main table to set up the calculation you want. * * Notice that inclusion of the DISTINCT keyword in the * expression for CITY_PCT takes care of the repetition of * Tokyo by the Second Company. * * The WHERE clause gets rid of missing values generated by * PROC TRANSPOSE in cases where fewer than five cities are * named. * * The GROUP clause controls the rows spanned by the COUNT * summary function in the main (outer) query. * */ proc sql feedback; title2 'Supplier Percentages'; select city, 100 * count(distinct company)/co_count as city_pct from for_sql, (select count(distinct company) as co_count from for_sql) where city is not missing group by city order by city_pct desc, city; quit;
?? 快捷鍵說明
復(fù)制代碼
Ctrl + C
搜索代碼
Ctrl + F
全屏模式
F11
切換主題
Ctrl + Shift + D
顯示快捷鍵
?
增大字號
Ctrl + =
減小字號
Ctrl + -