Author Topic: 15 Beta Data Miner Feature experiments  (Read 918 times)

timoc

  • EA User
  • **
  • Posts: 108
  • Karma: +6/-0
    • View Profile
15 Beta Data Miner Feature experiments
« on: June 14, 2019, 05:50:26 pm »
I have been exploring the Data Miner Diagram feature in the latest beta. From my limited experimentation, and EA experience, I suspect this feature will see a lot of use. I suggest those of you with the 15 beta experiment with it, so that you will not have to wait for the next release for changes once you see how useful it is.

It seems to be an off-the-shelf tool for importing model-to-model mapping and syncing information from other systems in your IT landscape. I have yet to get a diagram working; the documentation is a bit thin, and the model wizards do not (as yet) include common tasks such as 'mine CSV file' or 'mine JSON REST URL' examples.

I've created this thread for others experimenting to share their experiences and report any issues or tips.

If you need a URL for a JSON REST API, you can use GitHub: https://api.github.com/
If you need a URL for a CSV file, you can use: https://www.rba.gov.au/statistics/tables/csv/a6-data-prior-to-1984.csv?v=2019-06-14-09-44-55
Or find something else here: https://github.com/n0shake/Public-APIs

Current documentation on this feature:
https://sparxsystems.com/enterprise_architect_user_guide/15.0/model_publishing/data_miner.html

I expect it uses the new "Dynamic Model Add-Ins" mentioned here:
https://www.sparxsystems.com/products/ea/15/history.html
and documented here:
https://sparxsystems.com/enterprise_architect_user_guide/15.0/automation/modeladdins.html

The transformation from source to set is done with a script that uses the DataMiner Package. The new DataMiner package docs are here:
https://sparxsystems.com/enterprise_architect_user_guide/15.0/automation/data_miner_package.html
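To make the shape of such a script concrete, here is a minimal sketch of the kind of transformation I expect to write. Only the standard Repository automation calls (GetPackageByGuid, Elements.AddNew, Update, Refresh) are documented EA API; how the Data Miner hands the payload to the script is an assumption, so 'minedJson' and the target package GUID below are hypothetical stand-ins.

Code:
// Hedged sketch: transform a mined JSON payload into model elements.
// ASSUMPTION: the pipeline hands the script its payload as a string;
// 'minedJson' and the target package GUID are hypothetical placeholders.
function importMinedData(minedJson)
{
    var items = JSON.parse(minedJson);   // JSON.parse assumes the new JS60 engine
    var pkg = Repository.GetPackageByGuid("{TARGET-PACKAGE-GUID}");
    for (var i = 0; i < items.length; i++)
    {
        var el = pkg.Elements.AddNew(items[i].name, "Requirement");
        el.Notes = items[i].description || "";
        el.Update();                     // persist each new element
    }
    pkg.Elements.Refresh();
}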

Questions I have yet to answer:
- Can I put together multiple DMConnections, DMSets, and DMScripts in a single diagram to collate information?
- Can I use a DMScript (as a DMConnection in the workflow) to create DMSets?
- Can I, with a file DMConnection source, trigger the creation of the external CSV/JSON files (e.g. generate them via external scripts, such as in LINQPad)?
- Are there examples of how to tie these 'diagram workflows' to Automation model-based triggers?
- Can I add a CSV exchange specification (Publish -> Model Exchange -> CSV -> CSV Import Export) as a DMScript?
- Can I use my local repository as the EA Repository source (if I want to mine my own repository to transform elements)?
- How do I manage per-user credentials and admin credentials defined in a DMConnection element in a diagram?
- How do I process a directory of files in the file processor?
« Last Edit: June 14, 2019, 08:24:49 pm by timoc »

Paolo F Cantoni

  • EA Guru
  • *****
  • Posts: 6869
  • Karma: +147/-104
  • Inconsistently correct systems DON'T EXIST!
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #1 on: June 14, 2019, 06:13:08 pm »
Thanks timoc, for starting this thread. I had a quick look at the documentation this morning and, as you say, it looks like it could be useful.

I don't have time to experiment as yet, but I will be following the thread with interest.

Paolo
Inconsistently correct systems DON'T EXIST!
... Therefore, aim for consistency; in the expectation of achieving correctness....
-Semantica-
Helsinki Principle Rules!

timoc

  • EA User
  • **
  • Posts: 108
  • Karma: +6/-0
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #2 on: June 14, 2019, 09:23:21 pm »
Spent an hour or so playing with the REST API of our issue-tracking system.

The DMConnection (URL) fails silently and has no 'test connection' button, so you need to check that your URL works before you go on to the next piece.
The DMConnection does not tell you what data source type it is, so you should name your connection element, not just the diagram.
The DMConnection element is READ ONLY. The consequences of this are:
- I suspect I would have to dynamically create and destroy connection objects to make REST calls from inside the script for linked data. I am not sure how (or whether it is possible) to dynamically create DMConnection objects via the DataMiner Package, and even if I could, I cannot reuse the credentials stored in the object for subsequent REST calls.
The DMConnection credentials are embedded in the diagram element properties. If user-specific credentials are needed, then... I am not sure.
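Because the URL connection fails silently, a sanity check I would do first is hit the URL from a plain EA script. A minimal sketch, assuming COM interop is available to the script engine (COMObject in the new JS60 engine; classic JScript would use new ActiveXObject instead):

Code:
// Hedged sketch: verify a REST URL responds before wiring it into a
// DMConnection, since a bad URL fails silently there.
// ASSUMPTION: COM interop via COMObject (JS60 engine); in classic JScript
// use new ActiveXObject("MSXML2.XMLHTTP") instead.
function checkUrl(url)
{
    var http = new COMObject("MSXML2.XMLHTTP");
    http.open("GET", url, false);    // synchronous is fine for a one-off check
    http.send();
    Session.Output(url + " -> HTTP " + http.status);
    return http.status == 200;
}

checkUrl("https://api.github.com/");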

The DMSet's 'Open Visualisation' option only works when debugging a DB DMConnection type (which uses the DMArray class).

The example DMScripts created by the model wizard are slightly different for each connection type.

The Data Miner pipeline is read-only, so even though you can use HTTP POST and PUT URLs with a DMConnection (and its credentials), you cannot change the POST properties or URL.

« Last Edit: June 14, 2019, 09:46:28 pm by timoc »

timoc

  • EA User
  • **
  • Posts: 108
  • Karma: +6/-0
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #3 on: June 14, 2019, 10:13:17 pm »
The DMSet does not automatically recognize JSON data (or I cannot see how to make it do so), as the DMSet is an intermediate read-only type used to feed the DMScript.

In a DMScript, it is not obvious to me how to determine the repository context the script is running in. There seems to be no link from the Data Miner package object to the diagram or package it is in, which makes 'relative' element generation tricky. I am not sure how to discover the DMScript element's ID from the script context, or what to do with it if I can find it.
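One workaround I may try: look the DMScript element up by name through the automation API and derive the owning package from it. A sketch under that assumption; SQLQuery, GetElementByID and GetPackageByID are standard Repository calls, while the element name ('My DMScript') is a hypothetical stand-in.

Code:
// Hedged sketch: recover the script's context by finding its own DMScript
// element by name, then reading the owning package from it.
// ASSUMPTION: the script knows its element name; 'My DMScript' is hypothetical.
function findScriptContext(scriptElementName)
{
    var sql = "SELECT Object_ID FROM t_object WHERE Name = '" + scriptElementName + "'";
    var xml = Repository.SQLQuery(sql);   // result set comes back as an XML string
    Session.Output(xml);                  // extract Object_ID from the XML, then:
    // var el  = Repository.GetElementByID(objectId);
    // var pkg = Repository.GetPackageByID(el.PackageID);
}

findScriptContext("My DMScript");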

timoc

  • EA User
  • **
  • Posts: 108
  • Karma: +6/-0
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #4 on: June 14, 2019, 10:49:51 pm »
Developing in the REPL is tedious and error-prone.

It is faster to develop the transformation as a function in a script via the 'Specialise->Scripting' editing environment, which at least has a debugger. It is easier by far to create, in a test script, a JSON object containing a collection of objects representing the information you want to put into your model, so you can unit-test the mining before connecting it to the data source.
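A minimal sketch of that test-first approach, with a hypothetical transformation function and a hand-built fixture standing in for the real REST response:

Code:
// Hedged sketch: unit-test the mining logic against a hand-built JSON
// fixture before wiring it to a live DMConnection.
// 'mineIssues' is a hypothetical transformation function.
function mineIssues(payload)
{
    var issues = JSON.parse(payload);    // assumes the JS60 engine's JSON support
    for (var i = 0; i < issues.length; i++)
        Session.Output("would create element: " + issues[i].name);
}

// Fixture standing in for the real data source:
var fixture = '[{"name":"Issue-101"},{"name":"Issue-102"}]';
mineIssues(fixture);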

Giuseppe Platania

  • EA User
  • **
  • Posts: 53
  • Karma: +2/-0
  • As above so below
    • View Profile
    • Linkedin
Re: 15 Beta Data Miner Feature experiments
« Reply #5 on: June 18, 2019, 01:31:59 am »
Thank you for testing this, let us know more.
Giuseppe Platania
Enterprise Architect
+++ +++ +++ +++ +++ +++ ++++
"As above so below, to accomplish the miracles of the One Thing" - Hermes Trismegistus, first Enterprise Architect

timoc

  • EA User
  • **
  • Posts: 108
  • Karma: +6/-0
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #6 on: June 18, 2019, 09:07:24 pm »
Thank you for testing this, let us know more.
I have put it on hold for a day or two, until I have spare time to:
1- Dig out how to relate a Data Miner object back to the diagram it is instantiated in. I am looking at the ScriptingEA ebook and Geert's VBScript archive for inspiration.
2- Get a working JavaScript BDD/TDD REPL together - currently I think I have a shot at getting Jasmine working in the new SpiderMonkey JS60 runtime. This is proving to be more fun than playing with data mining and EA internals, so it has the highest priority ;)

Note: I am mostly interested in JSON URL-based mining using JavaScript. There seems to be existing scripting in the JavaScript/VBScript library for CSV processing, but I have not played with it. I suggest others look into database and CSV mining if that fits their needs.

For others who may have an interest in playing with the new JavaScript engine: I have had some limited success with browserify and nodify for simple modules. I think browserify with an EA source transform is eventually also a viable option for including node modules in EA scripting.

Ian Mitchell

  • EA User
  • **
  • Posts: 295
  • Karma: +10/-1
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #7 on: June 19, 2019, 09:36:45 pm »
Do we think this is intended to be a general-purpose framework for pulling external data into EA? Such as lists of tasks from an external task manager tool, or lists of components from a CMDB tool?
If not, what is it for?
Ian Mitchell, Designer, eaDocX


www.eaDocX.com
www.theartfulmodeller.com

timoc

  • EA User
  • **
  • Posts: 108
  • Karma: +6/-0
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #8 on: June 19, 2019, 10:51:25 pm »
Do we think this is intended to be a general-purpose framework for pulling external data into EA? Such as lists of tasks from an external task manager tool, or lists of components from a CMDB tool?
If not, what is it for?
In my honest opinion, 'general purpose' is pushing it; in its current form it is pretty limited, though it has the potential to be truly useful.

At the moment, it can be seen as a way of collecting and processing pre-collated information from various external data sources whose responsibility it is to maintain that data. It can possibly update the model, provided the data has some form of ID. It is a one-way street, so without a feedback mechanism you cannot annotate or update those systems with information from your model.

'Pre-collated' because there is no flexibility in the query on the EA mining side - it is fixed in the Data Miner model. Data is passed to your script as either a read-only string or an 'Array'-like result set from the SQL query, which I assume is also read-only. If you want to query for apples and pears, you will have to create separate DMActions and merge the results in the script.
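A minimal sketch of that merge step, with JSON literals standing in for the two DMAction payloads (how the payloads actually reach the script is an assumption):

Code:
// Hedged sketch: collate two fixed, pre-collated DMAction result sets in
// the script, since the mining-side query cannot be parameterised.
// The two JSON literals are hypothetical stand-ins for the real payloads.
var applesJson = '[{"name":"apple-1"},{"name":"apple-2"}]';
var pearsJson  = '[{"name":"pear-1"}]';

var merged = JSON.parse(applesJson).concat(JSON.parse(pearsJson));
Session.Output("merged " + merged.length + " rows");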

What is it for?
- Model updates - e.g. IP addresses (see the sketch after this list).
- Project information updates - for example, dates and times or the Kanban properties on elements.
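For the model-update case, a minimal sketch of pushing a mined value into a tagged value on an existing element; the GUID, tag name and value are hypothetical placeholders, while the Repository and TaggedValues calls are standard automation API.

Code:
// Hedged sketch: the 'model update' use case - write a mined value into a
// tagged value on an existing element. GUID, tag name and value are
// hypothetical placeholders.
var el = Repository.GetElementByGuid("{SOME-SERVER-ELEMENT-GUID}");
var tag = null;
for (var i = 0; i < el.TaggedValues.Count; i++)
{
    var t = el.TaggedValues.GetAt(i);
    if (t.Name == "ipAddress") { tag = t; break; }
}
if (tag == null)
    tag = el.TaggedValues.AddNew("ipAddress", "");
tag.Value = "10.0.0.42";   // value that would come from the mined data
tag.Update();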

Who is it for? Small teams with EAP/FEAP repositories, or one-off/personal repositories. It is a manual, client-side data sync tool. So imagine a small team where everyone has admin privileges, needs to update model information from a data source, and wants to share this as an MDG add-in for re-use. It seems designed not to compete with the cloud product, which uses automated server-side model sync.

To be truly useful it should allow true model sync for small EAP/FEAP teams and be re-usable in the cloud product for server-side automated sync.

Guillaume

  • EA User
  • **
  • Posts: 816
  • Karma: +22/-0
    • View Profile
    • www.umlchannel.com
Re: 15 Beta Data Miner Feature experiments
« Reply #9 on: June 19, 2019, 11:39:53 pm »
Hi,

I've also been testing this new feature in recent weeks, as it looks interesting. For instance, I've seen data models delivered in JSON or other formats that required a migration to Excel prior to importing them into EA. A script with quick access to a data source and the available parsers would be useful.
Of course this is one-way / import only.

So far I've successfully imported details from an Excel spreadsheet and started looking at importing an OData XML file (as an alternative to the procedure shared here: http://www.umlchannel.com/en/enterprise-architect/item/264-import-odata-metadata-sparx-enterprise-architect-database-schema).

I'm planning to share my results alongside the other EA15 features I'm playing with.
Guillaume

Blog: www.umlchannel.com | Free utilities addin: www.eautils.com


Ian Mitchell

  • EA User
  • **
  • Posts: 295
  • Karma: +10/-1
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #10 on: June 20, 2019, 06:26:06 pm »
Perhaps we could encourage the author of this new feature to include a set of worked examples for typical scenarios? Guillaume's example of importing an XML file would be a great start, as would JSON data and a simple CSV.
The examples should include not just how to code and run such a process, but also:
  • how to use EA to analyze the proposed data - so take, e.g., the XML and show the user the schema it seems to obey (if the schema isn't available for normal import)
  • how to use that analysis to create the import code
  • how to transform incoming data to the required format for EA
  • how the process would tell the user about errors in the incoming data
  • and how it would both create new - and change existing - EA elements as a result of the import
In other words, each part of a normal ETL (Extract, Transform, Load) process. And whilst we're doing all this, get the Sparx documentation to use 'ETL' language, as that's what people will recognize - this isn't something Sparx dreamed up by themselves (I hope...).

If we had such a facility, it would become much easier to promote EA as an information hub for data from multiple sources, where EA is the thing that makes sense of the links between widely spread data and so delivers useful new insights.
A warning though - this is a well-walked path, and there are several tools out there that already do this in a very sophisticated way, so let's have a fully thought-out end-to-end solution that shows how EA can play in this game.
Ian Mitchell, Designer, eaDocX


www.eaDocX.com
www.theartfulmodeller.com

timoc

  • EA User
  • **
  • Posts: 108
  • Karma: +6/-0
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #11 on: June 20, 2019, 08:08:04 pm »
Perhaps we could encourage the author of this new feature to include a set of worked examples for typical scenarios? Guillaume's example of importing an XML file would be a great start, as would JSON data and a simple CSV.
The examples should include not just how to code and run such a process, but also:
  • how to use EA to analyze the proposed data - so take, e.g., the XML and show the user the schema it seems to obey (if the schema isn't available for normal import)
  • how to use that analysis to create the import code
  • how to transform incoming data to the required format for EA
  • how the process would tell the user about errors in the incoming data
  • and how it would both create new - and change existing - EA elements as a result of the import
In other words, each part of a normal ETL (Extract, Transform, Load) process. And whilst we're doing all this, get the Sparx documentation to use 'ETL' language, as that's what people will recognize - this isn't something Sparx dreamed up by themselves (I hope...).

If we had such a facility, it would become much easier to promote EA as an information hub for data from multiple sources, where EA is the thing that makes sense of the links between widely spread data and so delivers useful new insights.
A warning though - this is a well-walked path, and there are several tools out there that already do this in a very sophisticated way, so let's have a fully thought-out end-to-end solution that shows how EA can play in this game.
I agree with the above - documentation on how and why to use the feature would be great.

I honestly hope ETL is not the only use case! In the modern world of service-oriented architectures and distributed responsibilities, keeping EA as the end of every pipeline does not make sense to me. Yes, it is an ETL pipeline, but if someone wants to use this to make a 'master database' as an integration point for an enterprise (let alone a data warehouse in the fixed EA data model or an EAP file!), then the 80s called, and they want their DeLorean back :)

Ian Mitchell

  • EA User
  • **
  • Posts: 295
  • Karma: +10/-1
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #12 on: June 20, 2019, 08:28:28 pm »
I'm not suggesting EA as a master DB for everything.
Just the stuff people need to base their modelling on real-world data.
Like servers and data links, components and interfaces, requirements from other tools, etc.
Ian Mitchell, Designer, eaDocX


www.eaDocX.com
www.theartfulmodeller.com

timoc

  • EA User
  • **
  • Posts: 108
  • Karma: +6/-0
    • View Profile
Re: 15 Beta Data Miner Feature experiments
« Reply #13 on: June 21, 2019, 07:41:24 am »
I'm not suggesting EA as a master DB for everything.
My apologies. Believe it or not, 'central database as integration silver bullet' is still a pattern I come across proponents of.

Just the stuff people need to base their modelling on real-world data.
Like servers and data links, components and interfaces, requirements from other tools, etc.
EA repository setup and maintenance is my major use case too. I would also like changes made to 'mined' elements to be fed back to the responsible system - using EA in scrum teams with the new Kanban diagrams, for example.

Guillaume

  • EA User
  • **
  • Posts: 816
  • Karma: +22/-0
    • View Profile
    • www.umlchannel.com
Re: 15 Beta Data Miner Feature experiments
« Reply #14 on: June 21, 2019, 05:15:34 pm »
From what I've seen, the aim of importing a data model is indeed not linked with a master DB, but rather caters for the following:
- Produce manageable, clear views of the data model with visual diagrams
- Understand the schema, especially with JSON or the like, i.e. data not managed in an RDBMS
- Establish the mapping, if needed, with dependent applications or data models
- Link it with the process model by associating BPMN Data Objects with the imported Classes

I published my EA15 beta preview here: http://www.umlchannel.com/en/enterprise-architect/item/290-sparxsystems-enterprise-architect-15-beta-preview.
It includes my feedback on the Data Miner.
Guillaume

Blog: www.umlchannel.com | Free utilities addin: www.eautils.com