Vote count:
0
I am having issues creating a partitioned table in Hive that contains the ":" and "/" special characters as values in the partition column name. I found this JIRA The partitions themselves get created, but I cannot access their data files on HDFS from Hue, nor can I see their results in the partition viewer in Hue. I found a related JIRA, but the issue was fixed and committed in version 0.6.0 (I'm on 0.13)
Here is what my raw data looks like:
1241241;Joe Schmo;TAXI DRIVER;08/06/2015;01/04/2014;13:20
My temporary staging table is created like this:
CREATE EXTERNAL TABLE IF NOT EXISTS nyc_cab_drivers
(
License_Number bigint,
Name string,
Type string,
Expiration_Date string,
Last_Updated_Date string,
Last_Updated_Time string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ';'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '...'
TBLPROPERTIES (
'skip.header.line.count'='1'
);
My partitioned table definition is:
CREATE TABLE IF NOT EXISTS part2_nyc_cab_drivers
(
License_Number bigint,
Name string,
Type string,
Expiration_Date string,
Last_Updated_Date string
)
PARTITIONED BY (Last_Updated_Time string)
CLUSTERED BY (Name) INTO 128 BUCKETS
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n';
Then I load the data into the partitioned table as follows:
FROM nyc_cab_drivers ncd
INSERT OVERWRITE TABLE part2_nyc_cab_drivers
PARTITION (Last_Updated_Time)
SELECT ncd.License_Number, ncd.Name, ncd.Type, ncd.Expiration_Date, ncd.Last_Updated_Date, ncd.Last_Updated_Time;
For this example, I have used dynamic partitioning but the same issue happens when I use static partitioning as well (i.e. Last_UpdatedTime="12:41"). After running the load, the partitions show up in HUE. I can see the partition columns, and the special characters are escaped by hex like this: "12%3A41". When I click on the partition value, it opens up the partition but says "The operation has no results." under results. When I click on the path, Hue tells me:
Cannot access: /user/hive/warehouse/http://ift.tt/1GkAYRA.
IllegalArgumentException: Pathname /user/hive/warehouse/http://ift.tt/1GkAYRA from /user/hive/warehouse/http://ift.tt/1GkAYRA is not a valid DFS filename. (error 400)
I see the directory created in HDFS for each partition, I just can't open it to view the datafiles. When I do a show partitions query, I see the partitions listed which means they were created, just something to do with the HDFS names. This same issue happens when I change the partition column to the date values that have "/" characters - the only difference is I can open the partition directory and view their data files but it still shows no results for the partition.
Any help would be greatly appreciated!
hive partition name with special character
Aucun commentaire:
Enregistrer un commentaire