我有300k的数据在excel中有区域列,(每个区域30k数据)。我将导入ttabthe文件到表中,并从1-10平均更新每个区域的数据。
Update table_name set region=region1 where createddate=(select * from table_name order by createddate desc where rownum=30000)
我是否需要为每个基于创建日期列的地区编写更新语句
?如果你正好有300k行,你只是想更新区域而没有任何特定的顺序,那么你可以使用MERGE
并在ROWID
伪列上关联:
MERGE INTO table_name dst
USING (
SELECT ROWID rid, CEIL(ROWNUM/30000) AS region
FROM table_name
) src
ON (src.rid = dst.ROWID)
WHEN MATCHED THEN
UPDATE SET region = src.region;
如果你想使用NTILE
(任何大小的数据集),那么:
MERGE INTO table_name dst
USING (
SELECT ROWID rid, NTILE(10) OVER (ORDER BY ROWNUM) AS region
FROM table_name
) src
ON (src.rid = dst.ROWID)
WHEN MATCHED THEN
UPDATE SET region = src.region;
我想ntile
可能会有所帮助。
这里有一个例子;我没有那么大的表,所以我将使用Scott的样例模式模拟它。我将统一地将行分成4个区域(对于这样一个小表来说,10个有点多)。
样本数据:
SQL> select * From test order by region, deptno, empno;
REGION DEPTNO EMPNO ENAME JOB
---------- ---------- ---------- -------------------- --------------------
10 7782 CLARK MANAGER
10 7839 KING PRESIDENT
10 7934 MILLER CLERK
20 7369 SMITH CLERK
20 7566 JONES MANAGER
20 7788 SCOTT ANALYST
20 7876 ADAMS CLERK
20 7902 FORD ANALYST
30 7499 ALLEN SALESMAN
30 7521 WARD SALESMAN
30 7654 MARTIN SALESMAN
30 7698 BLAKE MANAGER
30 7844 TURNER SALESMAN
30 7900 JAMES CLERK
14 rows selected.
设置区域(在您的情况下使用ntile(10)
):
SQL> merge into test a
2 using (select t.empno,
3 ntile(4) over (order by t.deptno, t.empno) region
4 from test t
5 ) b
6 on (a.empno = b.empno)
7 when matched then update set
8 a.region = b.region;
14 rows merged.
结果:
SQL> select * From test order by region;
REGION DEPTNO EMPNO ENAME JOB
---------- ---------- ---------- -------------------- --------------------
1 20 7369 SMITH CLERK
1 10 7782 CLARK MANAGER
1 10 7839 KING PRESIDENT
1 10 7934 MILLER CLERK
2 20 7902 FORD ANALYST
2 20 7566 JONES MANAGER
2 20 7876 ADAMS CLERK
2 20 7788 SCOTT ANALYST
3 30 7499 ALLEN SALESMAN
3 30 7521 WARD SALESMAN
3 30 7654 MARTIN SALESMAN
4 30 7844 TURNER SALESMAN
4 30 7900 JAMES CLERK
4 30 7698 BLAKE MANAGER
14 rows selected.
SQL>