java.io.IOException: While processing file s3://test/abc/request_dt=2021-07-28/someParquetFile. [XYZ] BINARY is not in the store
Databricks Community Forum » Hadoop
by zeyk
2y ago
Hi Team, I am facing an issue: "java.io.IOException: While processing file s3://test/abc/request_dt=2021-07-28/someParquetFile. [XYZ] BINARY is not in the store". The things I did before getting the above exception:
1. ALTER TABLE tableName1 ADD COLUMNS (xyz string);
2. ALTER TABLE tableName1 RECOVER PARTITIONS; -- recovered the new partition request_dt=2021-07-28
3. INSERT OVERWRITE TABLE tableName2 PARTITION (request_dt) SELECT old_column_names, XYZ FROM tableName1
The third command causes the exception. Any suggestions would be of great help. Thanks in advance.
Getting error after: hdfs namenode -format
by bghose
3y ago
Hi, I am installing Hadoop (3.2.2) on Ubuntu 18.04 for the first time. At the end of the installation, when I run 'hdfs namenode -format', it shows: ERROR: Invalid HADOOP_COMMON_HOME. Is this a common error, and how do I correct it? Please let me know. Thanks in advance.
Is it possible to install Apache Hadoop 3.3.1 and Apache HBase 2.4.2 via Ambari?
by aomer
3y ago
We are considering canceling our Cloudera support because of budget constraints. We currently use the HDP versions of Hadoop (v3.1.5) and HBase (v2.6). When we installed the HDP cluster, we installed all of its components via the HDP repos (VDF file), and Ambari installed everything. Is there a VDF file we can use to pull the repos for the open-source Apache versions so that Ambari can install the cluster?
Need installer for the bigsql service for a CDH 6.3.2 Hadoop cluster; please help me find it
by keshwalnitin23
3y ago
I need to install the bigsql service on a CDH 6.3.2 Hadoop cluster. Please help me find the correct installer for it. Thanks in advance.
Can someone suggest best practices for migrating from Hadoop to Databricks on AWS?
by DJAY
3y ago
Can someone suggest best practices for migrating from Hadoop to Databricks on AWS?
Hadoop Client version 3.2.1 vulnerability
by laszloczol
3y ago
I'm having a problem with hadoop-client version 3.2.1 in my dependency tree. It pulls in a vulnerable jar: org.apache.hadoop : hadoop-mapreduce-client-core : 3.2.1. The vulnerability is CVE-2017-3166: if a file in an encryption zone with access permissions that make it world-readable is localized via YARN's localization mechanism, that file will be stored in a world-readable location and can be shared freely with any application that requests to localize that file. The problem is that if I update to hadoop-client version 3.3.0, the vulnerability remains. Does anybody ...
Having trouble running Hadoop services
by Aman Kr Sharma
3y ago
I have installed Hadoop on my system, but when I try to run the Hadoop services using start-all.sh in the terminal, I get this error: zsh: command not found: start-all.sh
Dynamic Querying Hadoop
by chaithudbd
3y ago
Hi Team, is there any way in Hadoop, Spark, or any other component to enable dynamic querying on big data? For example, I have 3 TB of data in HDFS. I want to build an application that lets users choose their own filters or predicates, build a query, and get the result in real time or near real time. Example in detail: I have 3 TB of employee data in HDFS. I have created Hive external partitioned tables on top of it, pointing to the HDFS files. The goal is to let users choose the required data by filtering, selecting, or ordering the required columns. ne ...
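The idea of letting users pick their own columns, filters, and ordering can be sketched as a small query builder that only accepts whitelisted column names and operators. This is a minimal illustration and not a recommendation of any particular engine: the table name `employee`, the column names, and the `build_query` helper are all hypothetical, and the `?` placeholder style is an assumption that depends on the client used to run the query. It does not address the real-time requirement, which depends on the engine behind it.

```python
# Hypothetical column whitelist for an employee table; adjust to the real schema.
ALLOWED_COLUMNS = {"emp_id", "name", "department", "salary"}
ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}


def build_query(table, columns, filters, order_by=None):
    """Build a SELECT statement and its parameter list from user choices.

    `table` is chosen by the application, not by the user.
    `filters` is a list of (column, operator, value) tuples.
    """
    for col in columns:
        if col not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {col}")

    clauses, params = [], []
    for col, op, value in filters:
        if col not in ALLOWED_COLUMNS or op not in ALLOWED_OPS:
            raise ValueError(f"unsupported filter: {col} {op}")
        # Values are never interpolated into the SQL text; they are bound later.
        clauses.append(f"{col} {op} ?")
        params.append(value)

    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    if order_by is not None:
        if order_by not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {order_by}")
        sql += f" ORDER BY {order_by}"
    return sql, params


if __name__ == "__main__":
    query, values = build_query(
        "employee",
        ["name", "salary"],
        [("department", "=", "HR"), ("salary", ">", 50000)],
        order_by="salary",
    )
    print(query)
    print(values)
```

Filtering on the partition columns of the Hive table, where the user's choices allow it, lets the engine skip partitions instead of scanning all 3 TB.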
Unexpected arguments error appearing on the command line when running mapreduce job (MRjob) using python
by amackrach06
3y ago
I am fairly new to this process. I am trying to run a simple map-reduce job using Python 3.8 with a CSV on a local Hadoop cluster (Hadoop version 3.2.1). I am currently running it on Windows 10 (64-bit). The aim is to process a CSV file and output a count representing the top 10 salaries from the file, but it does not work. When I enter this command: $ python test2.py hdfs:///sample/salary.csv -r hadoop --hadoop-streaming-jar %HADOOP_HOME%/share/hadoop/tools/lib/hadoop-streaming-3.2.1.jar the output reports an error: No configs foun ...
How to read a Parquet file, change datatypes, and write to another Parquet file in Hadoop using PySpark
by harrykrishs
3y ago
New to PySpark. My source Parquet file has every column stored as a string. My destination Parquet file needs these columns converted to different datatypes such as int, string, date, etc. How do I do this?
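One way the conversion asked about above could be approached, as a minimal sketch and not a definitive answer: build one CAST expression per column and pass them to `DataFrame.selectExpr`. The column names, target types, and the two helper functions below are hypothetical, and the date cast assumes the source strings are already in `yyyy-MM-dd` form.

```python
def build_cast_exprs(target_types):
    """Return one Spark SQL expression per column, e.g. CAST(`id` AS int) AS `id`."""
    return [
        f"CAST(`{name}` AS {dtype}) AS `{name}`"
        for name, dtype in target_types.items()
    ]


def convert_parquet(spark, src_path, dst_path, target_types):
    """Read an all-string Parquet file, cast its columns, and write a new file.

    `spark` is an existing SparkSession; paths and types come from the caller.
    Only the columns named in `target_types` are kept in the output.
    """
    df = spark.read.parquet(src_path)
    converted = df.selectExpr(*build_cast_exprs(target_types))
    converted.write.mode("overwrite").parquet(dst_path)


# Hypothetical schema, for illustration only.
TARGET_TYPES = {"id": "int", "name": "string", "joined_on": "date"}

if __name__ == "__main__":
    # Print the generated expressions without needing a Spark installation.
    for expr in build_cast_exprs(TARGET_TYPES):
        print(expr)
```

With a running session, a call might look like `convert_parquet(spark, "/data/in", "/data/out", TARGET_TYPES)`, where both paths are placeholders. It is worth checking the result for unexpected NULLs, since strings that do not match the target type may not convert cleanly.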