Connecting Superset

From Zymonic

Connecting Superset[edit]

"Apache Superset is a modern data exploration and visualization platform" - Zymonic Systems can be configured such that selected filters can be denormalised (all data cisible to the user in a filter stored in a single table for fast access and filtering) and single sign on to Superset can be achieved with Zymonic's OAuth server capability.

Choosing How To Host[edit]

Both Zymonic Application Server and Superset will need access to a common database (this can be configured as a separate database to Zymonic's main database using the 'denormalised database' system options). To use the Zymonic toolkit to install / configure Superset the system that Superset is installed on will need at least a 'minimal' Zymonic install that allows the Configuration to be loaded (can be done by NFS share / copying the XML dir) and access to the DB server that contains the system options - if desired/needed for security reason then this DB access can be disabled once the install is done and superset is running (but will be needed again if restarting Superset).

To allow Zymonic to automatically trigger the login of the user to Superset (using OAuth), Zymonic and Superset must appear to be coming from the same server and port - this can either be done with reverse pass proxying (the 'connect_superset' command will generate a recommended config) or can be done using mod_wsgi (not tried at this point) to run Superset as an Apache module.

Installation[edit]

The installation and connection is designed to be as automated as possible.

The first step is to use the Zymonic Installer Toolkit (or a wrapper e.g. zymobuild) to bring your system up to date and a 'config_build' of each system you wish to connect to Superset.

Next install the superset manifest:

  sudo zymonic_toolkit.pl Installer install --manifests superset

The installer will ask some additional questions related to Superset:

 Superset bin dir - The bin dir of superset (where superset_config.py resides) (currently set to )
 Superset bin dir: /home/superset/.local/bin/
 Superset group - The group that the superset server will run as. (currently set to )
 Superset group: superset
 Superset hostname - The hostname users will use to connect to Superset (currently set to )
 Superset hostname: md5.zednax.com
 Superset ip address - The IP address superset will listen on (currently set to )
 Superset ip address: 0.0.0.0
 Superset log directory - The superset log file - will be rotated daily (currently set to )
 Superset log directory: /var/log/superset
 Superset master system - The system of which the superusers are also Admin users of Superset (currently set to )
 Superset master system: ztsm
 Superset port - The port on which superset will listen (currently set to )
 Superset port: 8088
 Superset proxy path - The proxy path being used in apache (with no slashes) e.g. "analytics"
 Superset proxy path: analytics
 Superset SQLite file - The path of the Superset SQLite file (usually [sqlite home]/.superset/superset.db (currently set to )
 Superset SQLite file: /home/superset/.superset/superset.db
 Superset SSL cert - The path of the Superset SSL certificate file (currently set to )
 Superset SSL cert: /etc/ssl/wildcard/wildcard.zednax.com.2019.crt
 Superset SSL key - The path of the Superset SSL key file (currently set to )
 Superset SSL key: /etc/ssl/wildcard/wildcard.zednax.com.2019.key
 Superset user - The user that the superset server will run as. (currently set to )
 Superset user: superset

The values above are examples and may or may not be correct for you system.

Next ensure that the the Python and PIP present and correctly installed (Python needs to have been compiled with sqlite flag).

A determination then needs to be made about which user to run Apache Superset as - bear in mind that both the apache process and superset process will need to write to the Superset SQLite database. So far having a superset and an apache user with permissions on the SQLite DB allowing access to both users has been tested.

Once Python (with SQLite) and PIP are installed then run the following command as the selected user:

 zymonic_toolkit.pl Superset install_superset

It is possible that Python module requirements may not be properly installed (e.g. currently, on the 4th Jan 2021, PyJWT needs to be installed at a lower version that is current); if so they will need to be manually resolved and the install command re-run - this can be done a number of times if necessary, but .profile may need the extra lines removing from the end of it as duplicates will be created.

Connecting Superset and Zymonic[edit]

Firstly link the ZymonicSuperset.xml file into any systems that you wish to connect to Superset - the connect will fail with a message telling you to do this if you forget.

 sudo ln -s /root/trunk/modules/ZymonicSuperset/xml/ZymonicSuperset.xml /etc/zymonic/ztsm/
 

Then do a config build.

 sudo zymonic_toolkit.pl System config_build --system ztsm

Finally run the connect_superset command

 sudo zymonic_toolkit.pl Superset connect_superset --system ztsm --username alexm --password ! --hostname md4.zednax.com

This command will generate some Apache configuration - check with your system administrator where to put the Apache configuration fragment and that Apache has the Proxy Modules enabled.

Starting Superset[edit]

To start Superset run

  sudo zymonic_toolkit.pl Superset start_superset --system ztsm

This will start Superset in the foreground _and_ will show the full command being used - you will need to discuss with your sysadmin the best way of routinely starting / running Superset.

A successful start should look something like:

04-01-2021 13:39:27 - Running: sudo su - apache -c 'export LC_ALL=C.UTF-8; export LANG=C.UTF-8; PATH=${PATH}:/home/superset/.local/bin; /var/www/.local/bin//superset run  --cert "/etc/ssl/wildcard/wildcard.zednax.com.2019.crt"  --key "/etc/ssl/wildcard/wildcard.zednax.com.2019.key"  -h 0.0.0.0 -p 8088 --with-threads --reload  --debugger'
Loaded your LOCAL configuration at [/var/www/.local/bin/superset_config.py]
logging was configured successfully
INFO:superset.utils.logging_configurator:logging was configured successfully
/var/www/.local/lib64/python3.6/site-packages/flask_caching/__init__.py:192: UserWarning: Flask-Caching: CACHE_TYPE is set to null, caching is effectively disabled.
  "Flask-Caching: CACHE_TYPE is set to null, "
No PIL installation found
INFO:superset.utils.screenshots:No PIL installation found
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
INFO:werkzeug: * Running on https://0.0.0.0:8088/ (Press CTRL+C to quit)
INFO:werkzeug: * Restarting with stat
Loaded your LOCAL configuration at [/var/www/.local/bin/superset_config.py]
logging was configured successfully
INFO:superset.utils.logging_configurator:logging was configured successfully
/var/www/.local/lib64/python3.6/site-packages/flask_caching/__init__.py:192: UserWarning: Flask-Caching: CACHE_TYPE is set to null, caching is effectively disabled.
  "Flask-Caching: CACHE_TYPE is set to null, "
No PIL installation found
INFO:superset.utils.screenshots:No PIL installation found
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 228-075-511

Note that there is a warning about WSGI being a development server - see https://github.com/apache/incubator-superset/issues/8673 about alternate installs if this would be more appropriate to your set-up.

Creating a Source[edit]

A source is a denormalised table and a corresponding entry (and permissions) in Superset's configuration database. A source is created by going to the 'Superset Sources List' in the Zymonic Admin UI (via Admin Area or in the 'Zymonic' menu in desktop mode) and clicking add. The fields are documented in the UI and the process should look something like:

The source should then be available in Superset...

Creating a Chart[edit]

You can either navigate directly to the Superset UI through the host/port you configured during the install or each time you save a source there will be a link to Superset. Once in Superset you will create a chart - https://superset.apache.org/docs/creating-charts-dashboards/ is the official documentation, but bear in mind that the 'standard' implementation will be individual charts being embedded in Zymonic rather than a Dashboard and that the sources will be already created. There are other guides and documents that give a better description e.g. https://medium.com/datadriveninvestor/create-your-first-sales-dashboard-in-apache-superset-c6a7f3d628d6

Once you have create a chart you will have a preview screen like the following:

At the top is a button labelled '</>' which gives a pop-up with a fragment of HTML that will generate an iframe and a copy button - use the copy button to copy the content.

In the Zymonic UI open the 'Superset Charts List' (from the Zymonic Menu in desktop mode or the Admin Area) and create a new entry, then paste the html source into 'Chart Definition'. Fill in the remaining required fields including which role permission to apply to the chart and which Source was used to create the chart.

Once you have saved the chart you should be able to open it from the Desktop menu.

Adding Superset Charts to a System[edit]

Creating individual charts and sources 'by hand' as described in the previous two sections would be used for development (and in principle could be used to adding custom charts to a running system) - however most charts and source would be created using 'DefaultProcess' XML e.g.

   <DefaultProcess sequence="10">
       <Table>
           <ZName>zz_superset_source</ZName>
       </Table>
       <Where>
           <Clause>[zz_source_name] = 'Orders'</Clause>
       </Where>
       <ZymonicRequest>
           <zz_superset_source_autoprocess>
               <Transition>zz_superset_source_autoprocess_save_transition</Transition>
               <zz_superset_source_autoform>
                   <zz_source_name>Orders</zz_source_name>
                   <zz_source_filter>oc_orders_list</zz_source_filter>
                   <zz_source_dt_field>Date/Time</zz_source_dt_field>
                   <zz_source_update_freq>500</zz_source_update_freq>
               </zz_superset_source_autoform>
           </zz_superset_source_autoprocess>
       </ZymonicRequest>
   </DefaultProcess>
   
   <DefaultProcess sequence="20">
       <Table>
           <ZName>zz_superset_charts</ZName>
       </Table>
       <Where>
           <Clause>[zz_chart_zname] = 'sc_order_payment_types'</Clause>
       </Where>
       <ZymonicRequest>
           <zz_superset_charts_autoprocess>
               <Transition>zz_superset_charts_autoprocess_save_transition</Transition>
               <zz_superset_charts_autoform>
                   <zz_chart_name>Order Payment Types</zz_chart_name>
                   <zz_chart_zname>sc_order_payment_types</zz_chart_zname>
                   <zz_chart_json>{"queryFields":{"metrics":"metrics","groupby":"groupby"},"datasource":"2__table","viz_type":"area","url_params":{},"time_range_endpoints":["inclusive","exclusive"],"granularity_sqla":"Date/Time","time_grain_sqla":"P1D","time_range":"No filter","metrics":[{"expressionType":"SIMPLE","column":{"id":31,"column_name":"Total","verbose_name":null,"description":null,"expression":null,"filterable":true,"groupby":true,"is_dttm":false,"type":"DOUBLE","python_date_format":null,"optionName":"_col_Total"},"aggregate":"SUM","sqlExpression":null,"isNew":false,"hasCustomLabel":false,"label":"SUM(Total)","optionName":"metric_8nsft7o5y2_1r4l2rweuski"}],"adhoc_filters":[],"groupby":["Payment Types"],"order_desc":true,"row_limit":10000,"show_brush":"auto","show_legend":true,"line_interpolation":"step-after","stacked_style":"expand","color_scheme":"supersetColors","label_colors":{},"rich_tooltip":true,"bottom_margin":"auto","x_ticks_layout":"auto","x_axis_format":"smart_date","y_axis_format":"SMART_NUMBER","y_axis_bounds":[null,null],"rolling_type":"None","comparison_type":"values","annotation_layers":[]}</zz_chart_json>
                   <zz_chart_role_permission>zz_superset_charts_read_autorolepermission</zz_chart_role_permission>
                   <zz_chart_source>oc_orders_list</zz_chart_source>
                   <zz_chart_height>600</zz_chart_height>
                   <zz_chart_width>800</zz_chart_width>
               </zz_superset_charts_autoform>
           </zz_superset_charts_autoprocess>
       </ZymonicRequest>
   </DefaultProcess>

Note that the chart json is automatically extracted and decoded when creating charts via the Zymonic UI however when creating XML for DefaultProcesses the developer will need to manually extract the chart json and decode it or save the chart on a development system and take the decoded JSON from there.

Updating Sources[edit]

Updating sources, i.e., updating the denormalised data, is done by running:

 [Sun Jan 10 19:39:22] amasidlover@md4 ~ $ sudo zymonic_toolkit.pl Superset update_sources --system ztsm --username alexm --password !
 password: 
 10-01-2021 19:41:13 - Updated Orders List
 Method update_sources completed.

The source update will occur if it was last done longer ago than the update frequency configured on the source, if not the update will report:

 [Sun Jan 10 19:41:14] amasidlover@md4 ~ $ sudo zymonic_toolkit.pl Superset update_sources --system ztsm --username alexm --password !
 password: 
 10-01-2021 19:43:55 - Orders List not updated - last update was 2021-01-10 19:41:13, minimum seconds between updates is 300
 Method update_sources completed.

This should be scheduled via a cron job on a production system.

Once SR: 100035 is implemented the sources will be automatically updated whenever the filter is run and will only be updated if the denormalised data is older than the records in the filter's source tables.

Troubleshooting[edit]

Logs[edit]

Logs can be found in the location configured during the install in the example '/var/log/superset/superset.log' e.g.

  [Mon Jan 04 13:53:39] amasidlover@md4 ~ $ tail /var/log/superset/superset.log
    from authlib.integrations.flask_client import OAuth
ModuleNotFoundError: No module named 'authlib'
logging was configured successfully
No PIL installation found
 * Running on https://0.0.0.0:8088/ (Press CTRL+C to quit)
 * Restarting with stat
logging was configured successfully
No PIL installation found
 * Debugger is active!
 * Debugger PIN: 228-075-511

It is also possible to enable debugs for all OAuth requests made to the Zymonic server by setting the following system option:

 update zz_system_options set option_value = 'true' where option_name = "zz_oauth_debugging"

This will cause debugs (with an 'oauth' prefix in the filename) to be generated in the debug directory configured for the system.

SQLite[edit]

The SQLite DB can be accessed using

 sqlite3 [path to superset db]

The path will usually be the home dir of the user that superset was installed as, followed by .superset/superset.db.

Useful SQLite specific commands:

 .tables
 .schema [table name] 
 .quit

The key tables that Zymonic uses and updates are:

  • dbs - the list of denormalised DBs that Superset connects to
  • tables - the list of denormalised tables that Superset connects to
  • table_columns - the list of columns in the denormalised tables
  • ab_permission - the list of permissions that Superset can apply to its 'objects'
  • ab_view_menu - the list of 'objects' e.g. tables or DBs that can be viewed by users
  • ab_permission_view - a list of which permissions apply to which viewable objects
  • ab_role - a list of roles (Zymonic roles will be created in here as [system]_[Zymonic role]
  • ab_permission_view_role - a list of which roles have which 'permission view' permissions.

Resetting[edit]

If you ever need to reset zymonic and superset to start from "scratch", the following can be run.

zTSM db:

delete from zz_superset_source;
delete from zz_superset_charts;

superset sqlite db:

delete from table_columns;
delete from tables;
delete from ab_view_menu where name like '%[znames used for sources]%';
delete from ab_permission_view where view_menu_id not in (select id from ab_view_menu);
delete from ab_permission_view_role where permission_view_id not in (select id from ab_permission_view);