How to add comments to a Delta Table in Scala?
Stack Overflow » Scala
by pgrandjean
3h ago
I would like to add comments to columns of an existing Delta table, without having to actually write SQL statements like "ALTER TABLE ALTER COLUMN". Is it possible to do it using only Scala ..read more
Visit website
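One SQL-free route for the question above, sketched under the assumption that (re)defining the table is acceptable: Delta's Scala builder API lets a comment be attached to each column via DeltaTable.columnBuilder. The table and column names below are placeholders, not from the question.

import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Attach column comments through the Delta Scala builder API instead of
// ALTER TABLE ... ALTER COLUMN. Note that createOrReplace redefines the table,
// so for an existing table this only fits when that is acceptable.
DeltaTable.createOrReplace(spark)
  .tableName("events")                          // placeholder table name
  .addColumn(
    DeltaTable.columnBuilder(spark, "event_id") // columnBuilder().build() yields a StructField
      .dataType("STRING")
      .comment("Unique identifier of the event")
      .build())
  .addColumn(
    DeltaTable.columnBuilder(spark, "ts")
      .dataType("TIMESTAMP")
      .comment("Event timestamp in UTC")
      .build())
  .execute()

For a table whose data must stay in place, a comment can also be carried in the DataFrame schema as column metadata under the "comment" key (the key StructField uses for comments), though that route typically means rewriting the table with overwriteSchema.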
Datawarehousing approach [closed]
Stack Overflow » Scala
by user6048082
12h ago
I am working on building/researching a data warehousing solution for my firm. Requirements: we receive close to 2mm (2 million) records per day, each with 100 columns, and users should be able to query/do analytics over at least 1 year's worth of data (2mm × 365 records) as quickly as possible. Most of the legacy code is written in Scala, so any solution with good Scala support is also a plus... Database/data warehouse options (AWS based): Redshift, RDS Aurora, hosting MySQL on an EC2 instance. Data warehouse options (non-AWS): Snowflake, BigQuery. Any suggestions? Thank you for your help ..read more
Visit website
How to truncate BQ Table from scala Spark
Stack Overflow » Scala
by Khilesh Chauhan
15h ago
We have a requirement to truncate a BQ table from Scala Spark. The background is that every table column has a description attached to it; if we overwrite the table, those descriptions no longer persist. We have explored various options: .option("writeDisposition", "WRITE_TRUNCATE") -- unfortunately it didn't work :( ; .mode("overwrite") -- this didn't preserve the descriptions. import com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.{BigQueryOptions, TableId} val bq = BigQueryOptions.getDefaultInstance().getService() val table = bq.getTable(TableId.of("project_id ..read more
Visit website
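One pattern that keeps column descriptions, sketched here with placeholder project/dataset/table names and the same repackaged BigQuery client the question already uses: issue a TRUNCATE TABLE DML statement, then append from Spark so the existing table (and its schema) is reused rather than replaced. The write options are assumptions about the spark-bigquery connector setup.

import com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.{BigQueryOptions, QueryJobConfiguration}

val bq = BigQueryOptions.getDefaultInstance.getService

// TRUNCATE TABLE removes the rows but leaves the table, including its
// column descriptions, in place (unlike dropping/recreating on overwrite).
bq.query(
  QueryJobConfiguration
    .newBuilder("TRUNCATE TABLE `my_project.my_dataset.my_table`")
    .build())

// Then append from Spark; appending reuses the existing schema and descriptions.
// df is the DataFrame to load (assumed). The direct write method avoids needing
// a temporary GCS bucket (assumption about the connector version in use).
df.write
  .format("bigquery")
  .option("table", "my_project.my_dataset.my_table")
  .option("writeMethod", "direct")
  .mode("append")
  .save()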
Spark UDF does not compute the final value on DataFrame, but it does on a test DataFrame
Stack Overflow » Scala
by cyberZamp
18h ago
I have a dataframe that is a list of Delta tables in the hive_metastore. For each table, I want to fetch the Delta Log to extract some information. I can do this by collecting the DataFrame into an Array and processing each row separately (sequentially or in parallel), but I am trying to test whether using UDFs could speed up all or some parts. The UDF: import org.apache.spark.sql.functions.udf val udfGetDeltaLog = udf( (catalog: String, database: String, table: String) => { val deltaLog = try { Some(DeltaLog.forTable(spark, TableIdentifier(table, Some(database), Some(catalog ..read more
Visit website
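For reference, a driver-side sketch of the collect-and-process alternative the question mentions: DeltaLog.forTable needs the active SparkSession, which is not available inside a UDF running on executors, so the lookups are done on the driver after collecting the table list. tablesDf, the column names, and spark being in scope are assumptions.

import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.delta.DeltaLog
import scala.util.Try

// Collect the catalog/database/table triples to the driver (the list of
// tables is assumed to be small), then resolve each DeltaLog there.
val tableRows = tablesDf
  .select("catalog", "database", "table")
  .collect()

val deltaLogs = tableRows.map { row =>
  val (catalog, database, table) =
    (row.getString(0), row.getString(1), row.getString(2))
  // forTable may throw for missing or non-Delta tables, hence the Try.
  Try(DeltaLog.forTable(spark, TableIdentifier(table, Some(database), Some(catalog)))).toOption
}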
Scala Spark: average of difference
Stack Overflow » Scala
by Jelly
18h ago
Given input dataframe with structure:

| machine_id | process_id | activity_type | timestamp |
| ---------- | ---------- | ------------- | --------- |
| 0          | 0          | start         | 0.712     |
| 0          | 0          | end           | 1.52      |
| 0          | 1          | start         | 3.14      |
| 0          | 1          | end           | 4.12      |
| 1          | 0          | start         | 0.55      |
| 1          | 0          | end           | 1.55      |

The task is to calculate average time of process per machine. The solution is to calculate diffe ..read more
Visit website
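A sketch of one way to finish that idea with Spark SQL functions: pivot the start/end rows onto one line per (machine_id, process_id), take the difference, then average per machine. Column names follow the table above; df is assumed to be the input DataFrame.

import org.apache.spark.sql.functions._

// One row per (machine_id, process_id), with start and end side by side.
val perProcess = df
  .groupBy("machine_id", "process_id")
  .pivot("activity_type", Seq("start", "end"))
  .agg(first("timestamp"))
  .withColumn("duration", col("end") - col("start"))

// Average process duration per machine, rounded for readability.
val avgPerMachine = perProcess
  .groupBy("machine_id")
  .agg(round(avg("duration"), 3).as("processing_time"))

avgPerMachine.show()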
Version mismatch for proj4-wrapper
Stack Overflow » Scala
by canpoint
21h ago
Currently, I am trying to run a GPS-to-Cartesian converter app locally using Scala. While writing the Parquet output files, I got the following error:

[TASK_WRITE_FAILED] Task failed while writing rows to file:/opt/spark-apps/2251.
at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:774)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:420)
at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
at org.apache.spark.rdd.RDD ..read more
Visit website
How to use play framework macros to parse and process json?
Stack Overflow » Scala
by RobotEyes
21h ago
I am trying to convert JSON from one structure to another. In the process I would like to add a prefix to one of the string fields (if it exists). I need to use the Play framework. How would I add to/process JSON fields as they are parsed using macros?

case class Book(title: Option[String], author: Option[String], published: Option[Long], info: JsObject)
object Book {
  implicit val bookImplicitReads = Json.reads[Book]
  implicit val bookImplicitWrites = Json.writes[Book]
}
def bookConvertor(inputObject: JsObject) = {
  Try {
    val title = (inputObject \ "title").asOpt[String]
    val auth ..read more
Visit website
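One way to keep the macro-derived Reads/Writes from the question and still prefix the field, sketched below: adjust the JsObject before validating it into the case class. The "prefix: " value is a placeholder.

import play.api.libs.json._

case class Book(title: Option[String], author: Option[String],
                published: Option[Long], info: JsObject)

object Book {
  // Macro-derived codecs, as in the question.
  implicit val bookReads: Reads[Book]    = Json.reads[Book]
  implicit val bookWrites: OWrites[Book] = Json.writes[Book]
}

def bookConvertor(inputObject: JsObject): JsResult[Book] = {
  // Prefix the title only if it is present, then let the macro Reads do the rest.
  val adjusted = (inputObject \ "title").asOpt[String] match {
    case Some(t) => inputObject + ("title" -> JsString("prefix: " + t))
    case None    => inputObject
  }
  adjusted.validate[Book]
}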
How to handle duplicated export of an identical symbol?
Stack Overflow » Scala
by Suma
1d ago
The following code imports the same symbol twice, because it imports two different objects which both export it:

object Ext:
  def backwards(s: String): String = s.reverse
object A:
  export Ext.*
object B:
  export Ext.*

import A.*
import B.*

backwards("Hello")

The error is: "Reference to backwards is ambiguous. It is both imported by import A._ and imported subsequently by import B._" It is ultimately the same symbol, so there is in fact no ambiguity, but I guess some implementation details of export hide this from the compiler. How can I solve this? Motivation: In my project I ..read more
Visit website
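One workaround, sketched on the question's own example: exclude (or rename) the duplicate binding on one of the imports so only a single backwards is in scope (Scala 3 syntax).

object Ext:
  def backwards(s: String): String = s.reverse

object A:
  export Ext.*

object B:
  export Ext.*

import A.*
import B.{backwards as _, *} // hide B's copy; A's backwards stays visible

@main def demo(): Unit =
  println(backwards("Hello")) // prints "olleH"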
Transform constructor names for Tapir Schemas
Stack Overflow » Scala
by David
2d ago
The JSON library 'Circe' has a Configuration param, transformConstructorNames: String => String, which allows you to transform the names of any class. Is there an equivalent for Tapir? The reason is that Tapir's Schema.derived[OpenAPISealedTrait] works great, but I want to drop the "OpenAPI" part from every class in the hierarchy. For example, if I have

sealed trait OpenAPIShape
case class OpenAPISquare(size: Int) extends OpenAPIShape
case class OpenAPICircle(radius: Int) extends OpenAPIShape

I want to be able to write Schema.derived[OpenAPIShape] with some config that drops ..read more
Visit website
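If the derivation configuration does not expose a constructor-name transformer, one fallback is to rename the derived schemas by hand. This is only a sketch: it relies on Schema being a case class with a name: Option[SName] field, and the assumption that semi-auto derivation of the parent will reuse implicit subtype schemas that are in scope.

import sttp.tapir.Schema
import sttp.tapir.Schema.SName

sealed trait OpenAPIShape
case class OpenAPISquare(size: Int) extends OpenAPIShape
case class OpenAPICircle(radius: Int) extends OpenAPIShape

// Drop the "OpenAPI" prefix from a schema's name, if it has one.
def stripPrefix[T](schema: Schema[T]): Schema[T] =
  schema.copy(name = schema.name.map(n =>
    n.copy(fullName = n.fullName.replace("OpenAPI", ""))))

// Renamed subtype schemas are made implicit so the parent derivation can
// pick them up (assumption about semi-auto derivation behaviour).
implicit val squareSchema: Schema[OpenAPISquare] = stripPrefix(Schema.derived[OpenAPISquare])
implicit val circleSchema: Schema[OpenAPICircle] = stripPrefix(Schema.derived[OpenAPICircle])
val shapeSchema: Schema[OpenAPIShape]            = stripPrefix(Schema.derived[OpenAPIShape])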
"java.lang.NoSuchMethodError: 'scala.collection.JavaConverters$AsJava scala.collection" error when I stream kafka messages using Pyspark
Stack Overflow » Scala
by Nanomachines_Son
2d ago
I am in a bind here. I am trying to implement a very basic pipeline which reads data from Kafka and processes it in Spark. The problem I am facing is that Apache Spark shuts down abruptly with the aforementioned error message. My PySpark version is 3.5.1 and my Scala version is 2.12.18. The code in question is:

from pyspark.sql import SparkSession
from pyspark.sql.functions import *

spark = SparkSession.builder \
    .appName('my_app') \
    .config("spark.jars", "/usr/local/spark/jars/spark-sql-kafka-0-10_2.12-3.5.1.jar") \
    .getOrCreate()

df = spark.readStream \
    .format('kafka ..read more
Visit website
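A hedged sketch of the usual fix for this class of error (mixed Scala binary versions or missing transitive Kafka jars): let Spark resolve the connector and its dependencies via spark.jars.packages instead of pointing spark.jars at the single connector jar. Shown here in Scala; the same config key applies from PySpark. The broker address and topic are placeholders.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("my_app")
  // Pulls spark-sql-kafka-0-10 *and* its transitive dependencies
  // (kafka-clients, commons-pool2, the token-provider jar) for Scala 2.12 / Spark 3.5.1.
  .config("spark.jars.packages", "org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.1")
  .getOrCreate()

val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
  .option("subscribe", "my_topic")                      // placeholder topic
  .load()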
