19 May 2015

17 May 2015

talk: Amit Sheth on Transforming Big data into Smart Data, 11a Tue 5/26

Transforming big data into smart data: deriving value via harnessing volume, variety and velocity using semantics and semantic web Professor Amit Sheth Wright State University 11:00am Tuesday, 26 May 2015, ITE 325, UMBC Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. In this talk, I will describe Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If my child is an asthma ...

AKSW Colloquium, 18-05-2015, Multilingual Morpheme Ontology, Personalised Access and Enrichment of Linked Data Resources

MMoOn - A Multilingual Morpheme Ontology by Bettina Klimek In recent years, lexical resources have emerged rapidly in the Semantic Web. Whereas most of the linguistic information is already machine-readable, we found that morphological information is either absent or only contained in semi-structured strings. While a plethora of linguistic resources for the lexical domain already exist and are highly reused, there is still a great gap for equivalent morphological datasets and ontologies. In order to enable the capturing of the semantics of expressions beneath the word-level, I will present a Multilingual Morpheme Ontology called MMoOn. It is ...

15 May 2015

DC-2015 registration is now open

2015-05-15, Online registration for DC-2015 is now open at http://dcevents.dublincore.org/IntConf/index/pages/view/reg15 . The conference and DCMI Annual Meeting is scheduled for 1-4 September in São Paulo, Brazil. This year's theme is "Metadata and Ubiquitous Access to Culture, Science and Digital Humanities". The need for structured metadata to support ubiquitous access across the Web to the treasure troves of resources spanning cultures, in science, and in the digital humanities is now common knowledge among information systems designers and implementers. Structured metadata expressed through languages of description make it possible for us to 'speak' about the contents of our treasure troves. But, like ...

14 May 2015

SNB Interactive, Part 2 - Modeling Choices

SNB Interactive is the wild frontier, with very few rules. This is necessary, among other reasons, because there is no standard property graph data model, and because the contestants support a broad mix of programming models, ranging from in-process APIs to declarative query. In the case of Virtuoso , we have played with SQL and SPARQL implementations. For a fixed schema and well known workload, SQL will always win. The reason is that SQL allows materialization of multi-part indices and data orderings that make sense for the application. In other words, there is transparency into physical design. An RDF/SPARQL-based application ...

SNB Interactive, Part 1 - What is SNB Interactive Really About?

This is the first in a series of blog posts analyzing the Interactive workload of the LDBC Social Network Benchmark . This is written from the dual perspective of participating in the benchmark design, and of building the OpenLink Virtuoso implementation of same. With two implementations of SNB Interactive at four different scales, we can take a first look at what the benchmark is really about. The hallmark of a benchmark implementation is that its performance characteristics are understood; even if these do not represent the maximum of the attainable, there are no glaring mistakes; and the implementation represents a ...

13 May 2015

Schema.org 2.0

We are pleased to announce the public release of Schema.org 2.0 which brings several significant changes and additions, not just to the vocabulary, but also to how we grow and manage it, from both technical and governance perspectives. As schema.org adoption has grown, a number of groups with more specialized vocabularies have expressed interest in extending schema.org with their terms. Examples of this include real estate, product, finance, medical and bibliographic information. Even in something as common as human names, there are groups interested in creating the vocabulary for representing all the intricacies of names. Groups that have a ...

12 May 2015

The Perfect Storm for Data

Mike Atkin of the EDM Council speaks eloquently about the "perfect storm" for data in Financial Services. Two converging forces, regulatory reporting requirements and the need for customer insight, are placing unprecedented demands on the data infrastructure in most financial institutions.

Clare Grasso: Information Extraction from Dirty Notes for Clinical Decision Support

Information Extraction from Dirty Notes for Clinical Decision Support Clare Grasso 10:00am Tuesday, 12 May 2015, ITE346 The term clinical decision support refers broadly to providing clinicians or patients with computer-generated clinical knowledge and patient-related information, intelligently filtered or presented at appropriate times, to enhance patient care. It is estimated that at least 50% of the clinical information describing a patient's current condition and stage of therapy resides in the free-form text portions of the Electronic Health Record (EHR). Both linguistic and statistical natural language processing (NLP) models assume the presence of a formal underlying grammar in the text. ...

10 May 2015

AKSW Colloquium, 11-05-2015, DBpedia distributed extraction framework

Scaling up the DBpedia extraction framework by Nilesh Chakraborty The DBpedia extraction framework extracts different kinds of structured information from Wikipedia to generate various datasets. Performing a full extraction of Wikipedia dumps of all languages (or even just the mapping-based languages) takes a significant amount of time. The distributed extraction framework runs the extraction on top of Apache Spark so that users can leverage multi-core machines or a distributed cluster of commodity machines to perform faster extraction. For example, performing extraction of the 30-40 mapping-based languages on a machine with a quad-core CPU and 16 GB of RAM takes about 36 hours. Running the distributed framework in the same setting ...

08 May 2015

Invited talk @AIMS webinar series

On 5 May, Ivan Ermilov presented the CKAN data catalog on behalf of AKSW as part of the AIMS (Agricultural Information Management Standards) webinar series. The recording and the slides of the webinar "CKAN as an open-source data management solution for open data" are available on the AIMS web portal: http://aims.fao.org/capacity-development/webinars/ckan-open-source-data-management-solution-open-data AIMS organizes free webinars, open to everyone, on various topics. You can find more recordings and material on the AIMS webpage, YouTube channel and Slideshare: Main page of Webinars@AIMS: http://aims.fao.org/capacity-development/webinars YouTube: http://www.youtube.com/user/FAOAIMSVideos Slideshare: http://www.slideshare.net/faoaims/ckan-as-an-opensource-data-management-solution-for-open-data

07 May 2015

Understanding Smart Data Integration in just 2 minutes

Data integration projects can be time-consuming, expensive and difficult to manage. Traditional data integration methods require point-to-point mapping of source and target systems. This effort typically requires a team of both business SMEs and technology professionals. These mappings are time-consuming to create and code, and errors in the ETL (Extract, Transform, and Load) process require iterative cycles through the process.

06 May 2015

Collation Sequences in SPARQL

The SPARQL query language is relatively silent about how to order strings. When the question was posed to us a while back, what to expect as the order of a solution sequence which contained string literals with language tags, we had just the conservative answer that the relation among simple or string literals and plain literals was undefined. This was not a nice situation. Even though RDF 1.1 ratifies the type rdf:langString , it defines no relation beyond equality, which leaves plfn:compare to apply but requires some context where it is possible to determine the collation sequence. This situation is ...
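As a rough illustration of why the choice of collation sequence matters when ordering language-tagged strings, here is a minimal Python sketch. It is not SPARQL and not what any particular store does: the literals are hypothetical (lexical form, language tag) pairs, and the accent- and case-folding key is only a crude stand-in for a real collation such as the Unicode Collation Algorithm.

```python
import unicodedata

# Hypothetical (lexical form, language tag) pairs standing in for
# rdf:langString literals; "\u00e4pfel" is "äpfel" with a precomposed a-umlaut.
literals = [("zebra", "en"), ("Apple", "en"), ("\u00e4pfel", "de")]


def codepoint_key(literal):
    """Order by raw code points of the lexical form, ignoring the tag."""
    text, _lang = literal
    return text


def folded_key(literal):
    """Crude collation stand-in: strip combining marks, then case-fold.

    The language tag is used only as a tie-breaker (an assumption made
    for this sketch, not a rule taken from any specification).
    """
    text, lang = literal
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (stripped.casefold(), lang)


by_codepoint = sorted(literals, key=codepoint_key)
by_folded = sorted(literals, key=folded_key)

print([text for text, _ in by_codepoint])
print([text for text, _ in by_folded])
```

The two orderings differ for the same input, which is the practical problem when a query language leaves the relation among such literals undefined.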

05 May 2015

Thoughts on KOS (Part 3): Trends in knowledge organization

The accelerating pace of change in the economic, legal and social environment, combined with tendencies towards increased decentralization of organizational structures, has had a profound impact on the way we organize and utilize knowledge. The internet as we know it today and especially the World Wide Web as the multimodal interface for the presentation and consumption of multimedia information are the most prominent examples of these developments. To illustrate the impact of new communication technologies on information practices, Saumure & Shiri (2008) conducted a survey on knowledge organization trends in the Library and Information Sciences before and after ...

04 May 2015

DCMI Webinar: Digital Preservation Metadata and Improvements to PREMIS in Version 3.0

2015-05-04, This webinar with Angela Dappert on 27 May gives a brief overview of why digital preservation metadata is needed, shows examples of digital preservation metadata, shows how PREMIS can be used to capture this metadata, and illustrates some of the changes that will be available in version 3.0. The PREMIS Data Dictionary for Preservation Metadata is the international standard for metadata to support the preservation of digital objects and ensure their long-term usability. Developed by an international team of experts, PREMIS is implemented in digital preservation projects around the world, and support for PREMIS is incorporated into a number ...

03 May 2015

SPARQL: the video

Well, a video, but a lot of important SPARQL basics in a short period of time.

