Ticket #118 (closed defect: fixed)
Data provider summary info
| Reported by: | igor / karen | Owned by: | joe |
|---|---|---|---|
| Priority: | normal | Milestone: | wait for funding |
| Component: | DataProvider | Version: | 1.2 |
| Severity: | normal | Keywords: | |
| Cc: | | | |
Description
(logs from Jabber)
(2005 July 22)
(6:28 PM) isuarezsola: Karen, the other day we were discussing providing "group by" or "summary" search results .. E.g. for observations that occur every 1 minute and you have tons of products, the VSO might get cluttered with too much metadata ... so one possibility is to "bundle up" the metadata, let's say one row of metadata for X minutes of products, where X could be 15, 30, 60 minutes etc .. and we would give a count of products found in that range
(6:29 PM) alisdair: Plus a couple of others - we just need to make sure we don't tread on each other's toes or get in the way of funding the VSO in the future!
(6:30 PM) isuarezsola: so my question is how difficult will it be to implement something like that on your end?
In terms of proposals I think we are in great shape but yes we need to do some thinking, brainstorming too ??
(6:32 PM) alisdair: Yes.
(6:39 PM) ktian: igor, the answer to your question is: it depends what you want to do. it's going to be a major problem for hmi, which is going to produce a data volume equal to that of all solar data combined in a few months.
we will have metadata for each individual image. it's a matter of selecting 'representative' data records to show.
(6:42 PM) isuarezsola: right, something similar will happen with the SOLIS FDP instrument. That's why going forward (actually short term) we need to find a way to condense the metadata shown in the GUI ... a "representative" data record as you put it ..
(6:45 PM) alisdair: Would it not be easier for those instruments that need it, to define (and serve) a summary product, rather than us trying to create a general way to do this in the GUI?
(6:45 PM) isuarezsola: one possibility would be for the provider to give different levels of "condensation", or "group by" or "summary" , what I have in mind is some clever SQL that gets you there ... however the "representative" data sounds more interesting ... what do you have in mind? to add a record of metadata every x products (like in a checkpoint record) or something like that?
A, yes that's a good idea .. not everybody will implement that ... but the summary product thing will have to have some standards ..
(6:47 PM) alisdair: Yeah ... a different standard for every data provider!
(6:49 PM) isuarezsola: the provider will implement it in the way they want but we need to find some standards on our end ... don't we?? please ..?? ;-)
(6:49 PM) ktian: we probably still need to provide some general guidelines for these metadata digests in order for them to be meaningful.
(6:50 PM) alisdair: No - you are right we would need to provide some input into this, but I think those that know the data best should provide the summary products they think a community needs.
(6:50 PM) isuarezsola: Ok, that makes a lot of sense yes ...
(6:50 PM) alisdair: Yes - and we still have to work out how to define "SUMMARY_PRODUCT" in the appropriate way!
(6:50 PM) ktian: i agree.
(6:51 PM) isuarezsola: which goes in the way of Karen's 'representative' records
(6:51 PM) alisdair: Yep
Damn I think we are all agreeing here!
(6:52 PM) isuarezsola: we have to understand it has been a long week and we are all tired ;-) yes but I like to know more about the 'representative' data ...
... we can leave it for monday though ...
(2005 July 23)
(5:44 PM) ktian: yes, it's up to the provider to decide the kind of summary data to produce. the vso however needs a framework/model to describe various forms of summary data, like what the vso data model does for the provider data holdings. there probably will be different categories of summary/digest data: downsample, image of the day, etc. for each such category, we need standard ways to describe how the summary/digest is made.

It's implemented, but for now it's provider-specified, and only when the provider thinks it's needed.
If we want to implement this on user-request, we should open a new ticket.
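For reference, the "group by" idea from the Jabber log (one row of metadata per X minutes of products, plus a count) could look roughly like the sketch below. It is only an illustration, not how any provider actually implements it: the `summarize` function, the `time` key on each record, and the choice of the first record in each window as the "representative" one are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def summarize(records, minutes=15):
    """Bundle metadata records into fixed windows of `minutes` minutes.

    `records` is a list of dicts, each with a datetime under the
    (hypothetical) key "time". Returns one summary row per non-empty
    window, in chronological order.
    """
    width = timedelta(minutes=minutes)
    epoch = datetime(1970, 1, 1)
    buckets = {}
    for rec in sorted(records, key=lambda r: r["time"]):
        # Floor the observation time to the start of its window.
        start = epoch + ((rec["time"] - epoch) // width) * width
        row = buckets.setdefault(
            start,
            {
                "start": start,
                "end": start + width,
                "count": 0,
                # Assumption: the earliest record stands in for the window.
                "representative": rec,
            },
        )
        row["count"] += 1
    return list(buckets.values())
```

Calling `summarize(records, minutes=30)` or `minutes=60` gives the coarser levels of "condensation" mentioned in the discussion. A provider could just as well pick the representative record differently (for example the middle one, or a dedicated summary product), which is where the guidelines discussed above would come in.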