  1. apache spark - How to set catalog and database with pyspark.sql ...

    Oct 18, 2024 · I want to create a Spark session with PySpark and update the session's catalog and database using the Spark config; is this possible? Using config isn't working. I tried to update …
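
    A minimal sketch of one way to do this, assuming Spark 3.4+ and a database that already exists; the catalog and database names below are placeholders, not values from the question:

    ```python
    from pyspark.sql import SparkSession

    # Build a session and set a default catalog via config (Spark 3.4+).
    # "spark_catalog" and "my_database" are placeholder names for illustration.
    spark = (
        SparkSession.builder
        .appName("catalog-demo")
        .config("spark.sql.defaultCatalog", "spark_catalog")
        .getOrCreate()
    )

    # Switch the current database (schema) for this session.
    spark.catalog.setCurrentDatabase("my_database")
    # Equivalent SQL form: spark.sql("USE my_database")

    print(spark.catalog.currentCatalog(), spark.catalog.currentDatabase())
    ```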

  2. How to insert, update rows in database from Spark Dataframe

    Nov 17, 2021 · Where (1, 11) was updated, (2, 22) was inserted, and (3, 33) wasn't changed. I guess there are two possible solutions: merge the data in a new DataFrame and fully rewrite the table in …
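
    A hedged sketch of the "fully rewrite the table" approach the snippet mentions, assuming a JDBC target table with columns (id, value); the connection details and table name are placeholders:

    ```python
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("merge-and-rewrite").getOrCreate()

    # Placeholder JDBC connection details; adjust for your database.
    jdbc_url = "jdbc:postgresql://localhost:5432/mydb"
    props = {"user": "user", "password": "secret", "driver": "org.postgresql.Driver"}

    existing = spark.read.jdbc(jdbc_url, "target_table", properties=props)
    updates = spark.createDataFrame([(1, 11), (2, 22)], ["id", "value"])

    # Keep existing rows whose id is not being replaced, then add the new/changed rows.
    merged = existing.join(updates, "id", "left_anti").unionByName(updates)

    # Materialize first, because the overwrite drops the table we just read from.
    merged.cache().count()
    merged.write.jdbc(jdbc_url, "target_table", mode="overwrite", properties=props)
    ```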

  3. apache spark - How to show all tables in all databases in …

    Aug 30, 2020 · The output is a Spark SQL view that holds the database name, table name, and column name. This covers all databases, all tables, and all columns. You could extend it to have …
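
    A hedged sketch of how such a view could be built from the catalog APIs; the view name is a placeholder:

    ```python
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("catalog-inventory").getOrCreate()

    # Collect (database, table, column) triples for everything in the catalog.
    rows = []
    for db in spark.catalog.listDatabases():
        for tbl in spark.catalog.listTables(db.name):
            for col in spark.catalog.listColumns(tbl.name, db.name):
                rows.append((db.name, tbl.name, col.name))

    inventory = spark.createDataFrame(
        rows, schema="database string, table string, column string"
    )

    # Expose it as a Spark SQL view, as the answer describes.
    inventory.createOrReplaceTempView("all_columns")
    spark.sql("SELECT * FROM all_columns").show(truncate=False)
    ```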

  4. pyspark - Can Apache Spark be used as a database replacement?

    Feb 7, 2020 · In fact, even for normal applications' SQL queries, you should prefer a database because Spark can be an inefficient alternative for typical, random access queries (it …

  5. How to perform an upsert (insert - Stack Overflow

    Nov 21, 2023 · I am trying to do an upsert from a PySpark DataFrame to a SQL table. sparkdf is my PySpark DataFrame. Test is my SQL table in an Azure SQL database. I have the following so far: …
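
    Spark's JDBC writer has no upsert mode, so a common pattern is to stage the DataFrame and run a MERGE on the SQL Server side. A hedged sketch, with placeholder connection details and a staging table name that is not from the question:

    ```python
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("upsert-demo").getOrCreate()

    sparkdf = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

    # Placeholder Azure SQL connection details.
    jdbc_url = "jdbc:sqlserver://myserver.database.windows.net:1433;databaseName=mydb"
    props = {
        "user": "user",
        "password": "secret",
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }

    # Step 1: land the DataFrame in a staging table.
    sparkdf.write.jdbc(jdbc_url, "Test_staging", mode="overwrite", properties=props)

    # Step 2: upsert the staged rows into Test on the database side (e.g. via
    # pyodbc or a stored procedure) with a T-SQL MERGE such as:
    merge_sql = """
    MERGE Test AS t
    USING Test_staging AS s ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET t.value = s.value
    WHEN NOT MATCHED THEN INSERT (id, value) VALUES (s.id, s.value);
    """
    ```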

  6. How to use JDBC source to write and read data in (Py)Spark?

    Jun 22, 2015 · How to optimize partitioning when migrating data from a JDBC source? How to improve performance for slow Spark jobs using DataFrame and a JDBC connection? How to …
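
    A minimal sketch of the JDBC reader and writer, assuming the driver jar is on the classpath; the URL, table names, and credentials are placeholders:

    ```python
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("jdbc-io").getOrCreate()

    url = "jdbc:postgresql://localhost:5432/mydb"  # placeholder

    # Read a table through the JDBC data source.
    df = (
        spark.read.format("jdbc")
        .option("url", url)
        .option("dbtable", "public.orders")
        .option("user", "user")
        .option("password", "secret")
        .option("driver", "org.postgresql.Driver")
        .load()
    )

    # Write the result back to another table.
    (
        df.write.format("jdbc")
        .option("url", url)
        .option("dbtable", "public.orders_copy")
        .option("user", "user")
        .option("password", "secret")
        .option("driver", "org.postgresql.Driver")
        .mode("append")
        .save()
    )
    ```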

  7. apache spark - Overwrite is failing with "pyspark.errors.exceptions ...

    Jun 4, 2025 · I upgraded PySpark from 3.5.5 to 3.5.6, and now all unit tests with an overwrite operation are failing with this error: pyspark.errors.exceptions.captured.AnalysisException: …
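
    The error message is truncated, so the cause can't be read off the snippet; below is only a hedged sketch of the kind of overwrite operation the question describes, with a placeholder table name and format:

    ```python
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("overwrite-demo").getOrCreate()

    df = spark.createDataFrame([(1, "a")], ["id", "value"])

    # The kind of write the question's unit tests perform: overwriting an
    # existing table. "my_db.my_table" is a placeholder; the AnalysisException
    # quoted in the question is not reproduced here.
    (
        df.write
        .mode("overwrite")
        .format("parquet")
        .saveAsTable("my_db.my_table")
    )
    ```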

  8. How to see all the databases and Tables in Databricks

    Sep 22, 2020 · [ (table.database, table.name) for database in spark.catalog.listDatabases() for table in spark.catalog.listTables(database.name) ] to get the list of databases and tables. EDIT: …
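
    A runnable form of the comprehension quoted in the snippet, assuming it runs in a session (or Databricks notebook) where `spark` is available:

    ```python
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("list-tables").getOrCreate()

    # (database, table) pairs across every database visible in the catalog.
    tables = [
        (table.database, table.name)
        for database in spark.catalog.listDatabases()
        for table in spark.catalog.listTables(database.name)
    ]

    for database, name in tables:
        print(f"{database}.{name}")
    ```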

  9. apache spark - Using pyspark to connect to PostgreSQL - Stack …

    I am trying to connect to a database with pyspark and I am using the following code:
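
    The question's code is cut off; a hedged sketch of a typical PostgreSQL connection through the JDBC source follows, where the driver coordinates, URL, and credentials are assumptions for illustration:

    ```python
    from pyspark.sql import SparkSession

    # The PostgreSQL JDBC driver must be available to the session; the Maven
    # coordinates below are an assumption for illustration.
    spark = (
        SparkSession.builder
        .appName("postgres-connect")
        .config("spark.jars.packages", "org.postgresql:postgresql:42.7.3")
        .getOrCreate()
    )

    df = (
        spark.read.format("jdbc")
        .option("url", "jdbc:postgresql://localhost:5432/mydb")
        .option("dbtable", "public.my_table")
        .option("user", "user")
        .option("password", "secret")
        .option("driver", "org.postgresql.Driver")
        .load()
    )

    df.show()
    ```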

  10. How to Read Data from DB in Spark in parallel - Stack Overflow

    Feb 1, 2021 · Saurabh, in order to read in parallel using the standard Spark JDBC data source support, you do indeed need to use the numPartitions option, as you supposed. But you need to …
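
    A hedged sketch of a parallel JDBC read: with the standard source, numPartitions only takes effect together with partitionColumn, lowerBound, and upperBound. The connection details, column, and bounds below are placeholders:

    ```python
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("parallel-jdbc-read").getOrCreate()

    # Spark uses partitionColumn with lowerBound/upperBound to derive
    # numPartitions range predicates, so the table is read by several tasks at
    # once (rows outside the bounds still land in the first/last partition).
    df = (
        spark.read.format("jdbc")
        .option("url", "jdbc:postgresql://localhost:5432/mydb")
        .option("dbtable", "public.big_table")
        .option("user", "user")
        .option("password", "secret")
        .option("driver", "org.postgresql.Driver")
        .option("partitionColumn", "id")
        .option("lowerBound", "1")
        .option("upperBound", "1000000")
        .option("numPartitions", "8")
        .load()
    )

    print(df.rdd.getNumPartitions())  # up to 8 concurrent JDBC reads
    ```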