这更像是一个SQL问题而不是PostGIS问题,但我又
陷入了:(我有一个名为 reference 的表,在"发件人"和"to"列中带有 ID 号。 我想根据邮政编码计算所有这些 ID 号之间的距离。
有一个名为doc的单独参考表,其中包含"NPI"列中的ID号和"提供商企业邮寄地址邮政编码"列中的邮政编码,以及一个名为zctas的单独地理表,其邮政编码列为zcta和geom列。
例如,此查询工作正常:
SELECT z.zcta As zip1,
z2.zcta As zip2,
ST_Distance(z.geom,z2.geom) As thedistance
FROM zctas z,
zctas z2
WHERE z2.zcta = '60611'
AND z.zcta='19611';
一个问题是"提供商企业邮寄地址邮政编码"应 = left("提供商企业邮寄地址邮政编码",5)。
在这个查询中,我卡在连接参考表中的 2 个邮政编码上。
示例表:
参考表:
from | to | count
------------+------------+-------
1174589766 | 1538109665 | 108
1285653204 | 1982604013 | 31
期望的输出:
from | to | count | distance
------------+------------+----------------
1174589766 | 1538109665 | 108 | 53434
1285653204 | 1982604013 | 31 | 34234
\d+
Table "public.zctas"
Column | Type | Modifiers | Storage | Stats target | Description
------------------+------------------------+-----------+----------+--------------+-------------
state | character(2) | | extended | |
zcta | character(5) | | extended | |
junk | character varying(100) | | extended | |
population_tot | bigint | | plain | |
housing_tot | bigint | | plain | |
water_area_meter | double precision | | plain | |
land_area_meter | double precision | | plain | |
water_area_mile | double precision | | plain | |
land_area_mile | double precision | | plain | |
latitude | double precision | | plain | |
longitude | double precision | | plain | |
thepoint_lonlat | geometry(Point,4269) | | main | |
thepoint_meter | geometry(Point,32661) | not null | main | |
geom | geometry(Point,32661) | | main | |
Indexes:
"idx_zctas_thepoint_lonlat" gist (thepoint_lonlat)
"idx_zctas_thepoint_meter" gist (thepoint_meter) CLUSTER
Table "public.referred"
Column | Type | Modifiers | Storage | Stats target | Description
--------+-----------------------+-----------+----------+--------------+-------------
from | character varying(25) | | extended | |
to | character varying(25) | | extended | |
count | integer | | plain | |
Has OIDs: no
Table "public.doc"
Column | Type | Modifiers | Storage | Stats target | Description
--------------------------------------------------------------+------------------------+-----------+----------+--------------+-------------
NPI | character varying(255) | | extended | |
Entity Type Code | character varying(255) | | extended | |
Replacement NPI | character varying(255) | | extended | |
Employer Identification Number (EIN) | character varying(255) | | extended | |
Provider Organization Name (Legal Business Name) | character varying(255) | | extended | |
Provider Last Name (Legal Name) | character varying(255) | | extended | |
Provider First Name | character varying(255) | | extended | |
Provider Middle Name | character varying(255) | | extended | |
Provider Name Prefix Text | character varying(255) | | extended | |
Provider Name Suffix Text | character varying(255) | | extended | |
Provider Credential Text | character varying(255) | | extended | |
Provider Other Organization Name | character varying(255) | | extended | |
Provider Other Organization Name Type Code | character varying(255) | | extended | |
Provider Other Last Name | character varying(255) | | extended | |
Provider Other First Name | character varying(255) | | extended | |
Provider Other Middle Name | character varying(255) | | extended | |
Provider Other Name Prefix Text | character varying(255) | | extended | |
Provider Other Name Suffix Text | character varying(255) | | extended | |
Provider Other Credential Text | character varying(255) | | extended | |
Provider Other Last Name Type Code | character varying(255) | | extended | |
g(255) | | extended | |
Provider Second Line Business Mailing Address | character varying(255) | | extended | |
Provider Business Mailing Address City Name | character varying(255) | | extended | |
Provider Business Mailing Address State Name | character varying(255) | | extended | |
Provider Business Mailing Address Postal Code | character varying(255) | | extended | . . . . other columns not really needed.
谢谢!!!!
这应该相对简单。
假设 NPI 在doc
和referred
中实际上都是相同的长度,您可以非常轻松地连接这些表:
SELECT ad."Provider Business Mailing Address Postal Code" as a_zip,
bd."Provider Business Mailing Address Postal Code" as b_zip,
r."count"
FROM referred r
LEFT JOIN doc ad ON r."from" = ad."NPI"
LEFT JOIN doc bd ON r."from" = bd."NPI";
显然,根据对数据中NPI
和from
/to
字段的仔细分析来调整此联接。如有必要,在联接中添加trim
或left
方法调用 - 最重要的是JOIN
条件在可比较数据上。
现在,从此查询到原始查询以查找距离是微不足道的:
SELECT ad."Provider Business Mailing Address Postal Code" as a_zip,
bd."Provider Business Mailing Address Postal Code" as b_zip,
r."count",
ST_Distance(az.geom,bz.geom) As thedistance
FROM referred r
LEFT JOIN doc ad ON r."from" = ad."NPI"
LEFT JOIN doc bd ON r."from" = bd."NPI"
LEFT JOIN zctas az
ON az.zcta = left(ad."Provider Business Mailing Address Postal Code",5)
LEFT JOIN zctas bz
ON bz.zcta = left(bd."Provider Business Mailing Address Postal Code",5)
这只是一种应该可行的结构,许多其他结构都是可能的。此特定构造将确保表示referred
中的每个条目,即使NPI
与doc
表中的条目不匹配,或者邮政编码无法与zctas
表匹配也是如此。
另一方面,如果doc
表中存在多个NPI
条目,则提及此重复NPI
的任何referred
条目也将重复。
同样,如果特定邮政编码(zcta
)zctas
中有多个条目,您将看到referred
行的重复项。
这就是LEFT JOIN
的工作方式,但我认为值得加入警告,因为提供程序数据通常充满了针对 NPI 的重复项,并且邮政编码查找列表中经常有重复的邮政编码,因为某些邮政编码跨越州线。