I thought this would be of interest to the people here. Not the API discussion so much as the RFP itself.
---------- Forwarded message ----------
From: Rhiannon Coppin <[hidden email]>
Date: Tue, Mar 20, 2012 at 5:48 PM
Subject: Re: [OpenDataBC] "Publishing Open Data - Do you really need an API?"
To: [hidden email]

I've been following this discussion. For me (as a journalist), an API only makes sense for, yes, large data sets where I only want a small fraction (map information, Twitter bits, etc.), or when the data represents something ongoing and real-time, like crime occurrences that are reported daily or weekly, or other events that happen sporadically -- like board decisions on doctor misconduct or workplace safety violations.

On that note, I want to show the list a current RFP (1 week left) for "a data exploration and visualization tool" for WorkSafeBC, which is posted on bcbid.gov.bc.ca. WorkSafeBC wants a system to tie into existing database systems and provide levels of access for analysts, developers, and for "consumers." (Is that the public? Is this an OpenData problem/opportunity?)

Anyhow, I can't link to the document online (session-driven site), but I will quote from it here, try to attach the included diagram, and ask you all:

-- Is this RFP asking (specifically) for an API? Is this a case where one makes sense? or
-- Is there an Open opportunity here if the winner of this bid builds an API as part of the project, and that API is made public?

Quoting:

> The tool needs to be able to connect to multiple data sources, perform data mashups, provide advanced visualizations, accommodate geospatial analysis, and provide an easy to use interface for analysts, developers and end consumers.
>
> Technical Specifications:
>
> Ease of Use: the tool’s overall ease of use for analysts, developers and consumers is paramount. Analysts and developers should not need to know SQL or any other proprietary languages to leverage the tool. Consumers should be able to interact easily with the data in a highly visual, easy to use interface.
>
> Pull and Assemble Data: the tool will allow Analysts to access and join data from multiple data stores and combine external data in the form of flat files, spreadsheets or other formats. Analysts should be able to access a central repository containing frequently used joins, designs and objects and to access metadata captured from earlier explorations.
>
> Analyze and Visualize Data: the tool will provide advanced analysis capabilities and support advanced visualization techniques. The tool should be able to present geo-coded data on a visual map. Formatting options for layout and design must be flexible and easy to use.
>
> Package and Distribute Data: the final output of the data exploration and visualization should be easy to package and distribute. Consumers should be able to interact with the output while connected to their local area network or when working disconnected. The output should be exportable to Excel, PowerPoint and other formats.
>
> Mobile Capabilities: the tool should allow consumers who work in the field or the office to receive and interact with data on mobile devices including tablets and smart phones.
>
> Training and Support: the training required to leverage the full capabilities of the tool should be manageable for Analysts and minimal for Consumers. Training should be clearly accessible through e-learning, online help and through local training partners if necessary.
>
> Architecture and Integration: the tool should be able to integrate with the databases, cubes and semantic layers that are currently in production at WorkSafeBC and provide integration with SharePoint. The tool and its capabilities need to be embedded within our existing applications and web pages. The tool should be easy to deploy and scalable for enterprise use. The initial installation is estimated at approximately 500 users.

Rhiannon
@coppinr
Vancouver, B.C.
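A side note on the "Pull and Assemble Data" requirement above: what it describes is essentially a join between an internal data store and an external flat file. Here is a minimal sketch of that kind of mashup in Python -- the database, table and file names are hypothetical placeholders, not anything from the RFP:

    # Sketch: join a table from an existing data store with an external flat file.
    # "worksafe.db", the claims table and "regions.csv" are all hypothetical.
    import sqlite3
    import pandas as pd

    conn = sqlite3.connect("worksafe.db")

    # Pull data from an existing data store.
    claims = pd.read_sql("SELECT region_id, claim_count FROM claims", conn)

    # Combine with external data supplied as a flat file.
    regions = pd.read_csv("regions.csv")  # e.g. region_id, region_name

    # The "mashup": join the two sources on a shared key.
    merged = claims.merge(regions, on="region_id")
    print(merged.head())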
On Tue, Mar 20, 2012 at 11:41 AM, James McKinney <[hidden email]> wrote:

> +1 Herb, Kevin. Nik points out some good examples of cases in which bulk downloads aren't timely (real-time data), don't support the type of query a user wants, or are very large. However, most datasets are not like this. There will likely be more of these as open data grows, but I think they will always remain the exception.
>
> On 2012-03-20, at 1:34 PM, Kevin McArthur wrote:
>
> Yes, I have to agree with Herb here. I'm always going to want the data, no matter how large.
>
> My OpenMoonMap.org site is the perfect example... it has to rely on NASA's idea of what the web mapping should look like. It goes down all the time, taking the site down with it. I can't use any of the off-the-shelf GIS tools with it, because all we have is a basic WMS. I've been on a mission to get the source data for this, but so far no luck, and thus the site won't grow.
>
> That said, I do like an API in a few very specific and limited situations:
>
> Real-time data. GeoRSS feeds, live video feeds, etc.
> Authority at a point in time. (Think a lien search, or some other similar process that requires authoritative clearance as of a very specific point in time.)
> APIs are also appropriate for creating incoming data streams -- for crowdsourced data input. (E.g. fish catch reporting would be a cool API.)
>
> In almost all other scenarios, APIs are generally a massive pain to work with, add a significant point of failure, and add little to no value in the open-data scenario. (We can transfer terabytes easily these days, so data size is really a non-starter.) Last year we even published a CSV->API converter as a public domain product, to stop folks chasing their tails on API creation. If some management process says you absolutely have to have an API for CSV-type data, it should take about 5 minutes to set up, and no time/resources/money should be dedicated to this task; the underlying data should always be published.
>
> --
> Kevin
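Kevin's converter itself isn't linked in the thread, but the pattern he describes is small enough to sketch. A minimal CSV-to-API wrapper in Python using Flask -- "data.csv" is a hypothetical stand-in for an already-published dataset, and this is an illustration of the idea, not his product:

    # Expose the rows of an already-published CSV file as a read-only JSON API.
    import csv
    from flask import Flask, jsonify

    app = Flask(__name__)

    def load_rows(path="data.csv"):  # hypothetical published dataset
        with open(path, newline="") as f:
            return list(csv.DictReader(f))

    @app.route("/api/rows")
    def all_rows():
        # Whole dataset, equivalent to the bulk download.
        return jsonify(load_rows())

    @app.route("/api/rows/<int:n>")
    def one_row(n):
        # Single-record lookup by row index.
        rows = load_rows()
        if n >= len(rows):
            return jsonify({"error": "no such row"}), 404
        return jsonify(rows[n])

    if __name__ == "__main__":
        app.run()

The point of the sketch is Kevin's cost argument: wrapping a published CSV is minutes of work, so an API requirement should never delay or replace releasing the underlying file.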
> On 12-03-20 10:07 AM, Herb Lainchbury wrote:
>
> APIs are almost always unnecessary and IMHO should be avoided wherever possible by governments, for several reasons.
>
> 1) They are expensive to create compared to providing bulk datasets.
>
> 2) They are expensive to maintain.
>
> 3) They won't be used. I usually won't use the extra functionality a publisher's API offers over plain old downloads unless I absolutely have no other choice. Why? Because I always want to minimize coupling between systems. The less I know about how your system works, the better. Leave a bulk file at the end of a URL and I know everything I need to know about how to get the data. I will almost always write an ETL to get the data into a platform that I know I can control and rely on (a sketch of this pattern follows at the end of the thread).
>
> 4) They are almost always unnecessary. There are few datasets so large that I wouldn't want to grab them whole (Kevin's openmoonmap.org comes to mind).
>
> 5) They are too easily a diversion from releasing data. It's human nature to get "busy" building something new rather than just doing what's required. As Ward Cunningham, inventor of the wiki, says, "Do the simplest thing that could possibly work."
>
> Good question, Jury. Thanks for asking it.
>
> H
>
> On Tue, Mar 20, 2012 at 5:56 AM, James McKinney <[hidden email]> wrote:
>>
>> I think raw/bulk downloads should be available as often as possible. APIs, as a means of distributing data, should be reserved for when the underlying data cannot be distributed, or if the API is performing some costly operation to generate its response (which not all developers may have the technical know-how to implement).
>>
>> On 2012-03-20, at 8:03 AM, Jury Konga wrote:
>>
>> > I found this to be an interesting article
>> > http://www.peterkrantz.com/2012/publishing-open-data-api-design/ and I
>> > know this group will have opinions. Looking forward to the
>> > feedback :-)
>> >
>> > Cheers, Jury
>
> --
> Herb Lainchbury
> Dynamic Solutions Inc.
> www.dynamic-solutions.com
> http://twitter.com/herblainchbury

[attachment: WorkSafeBCMarch20120RFPimg.png (59K)]
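Herb's point 3 is the bulk-download-plus-ETL pattern. A minimal sketch of it in Python -- the dataset URL is a hypothetical placeholder, and SQLite stands in for whatever platform you control:

    # Fetch a bulk CSV left "at the end of a URL" and load it into a
    # local SQLite table. The URL and table name are hypothetical.
    import csv
    import io
    import sqlite3
    import urllib.request

    URL = "https://example.gov/open-data/dataset.csv"  # hypothetical

    # Extract: one plain HTTP GET is the entire protocol.
    with urllib.request.urlopen(URL) as resp:
        text = resp.read().decode("utf-8")

    # Transform: parse the CSV into a header row and data rows.
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]

    # Load: into a local database the consumer controls.
    conn = sqlite3.connect("local.db")
    cols = ", ".join('"%s"' % c for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute("CREATE TABLE IF NOT EXISTS dataset (%s)" % cols)
    conn.executemany("INSERT INTO dataset VALUES (%s)" % marks, data)
    conn.commit()

Once the data is local, the publisher's uptime, rate limits and query semantics stop mattering -- which is exactly the decoupling Herb is after.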