Subscribe: Untitled
http://www.freepatentsonline.com/rssfeed/rssapp707.xml
Added By: Feedage Forager Feedage Grade B rated
Language: English
Tags:
based  content items  content  data  includes  information  items  method  model  page  query  search  set  system  user 

ACCESS CONTROL FOR A DOCUMENT MANAGEMENT AND COLLABORATION SYSTEM

Thu, 27 Oct 2016 08:00:00 EDT

A method and apparatus for controlling access to documents retained by a document management and collaboration system is disclosed. The document management and collaboration system may generate one or more suggested privileges associated with one or more users. An access control policy may specify whether system-generated user privileges may be enforced. If they are enforced, access to one or more documents may be made subject to the generated privileges.



MANAGING DATA RECORDS

Thu, 27 Oct 2016 08:00:00 EDT

Data records may be managed in a relational database by monitoring a record length for a first data record in a page of memory, an amount of free space in the page, and a page length. In response to receiving an operator command to replace the first data record with a second data record, a database management system may determine whether an estimated record length of a compressed second data record is outside of the amount of free space in the page. In response to determining that the estimated record length of the compressed second data record is outside of the amount of free space in the page, the database management system may determine whether an estimated length of a compressed page is outside of the page length. In response to determining that the estimated length of the compressed page is within the page length, the page may be compressed.
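
The two-stage check in this abstract can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `plan_replacement`, its parameters, and the returned labels are hypothetical, "outside of" is read as "greater than", and the first branch (the record fits in free space) is an assumed outcome that the abstract does not spell out.

```python
def plan_replacement(free_space, est_record_len, est_page_len, page_len):
    """Decide how to handle replacing a record with a compressed one.

    All lengths are in bytes. Names and return labels are illustrative
    assumptions, not taken from the patent.
    """
    # Step 1: does the estimated compressed record fit in the page's free space?
    if est_record_len <= free_space:
        return "store_in_free_space"
    # Step 2: the record does not fit; would the whole page fit once compressed?
    if est_page_len <= page_len:
        return "compress_page"
    # Neither check passed; the abstract does not say what happens next.
    return "unresolved"
```

For example, `plan_replacement(100, 150, 4000, 4096)` takes the second branch, because the record exceeds the free space but the compressed page stays within the page length.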



ITEM SHARING BASED ON INFORMATION BOUNDARY AND ACCESS CONTROL LIST SETTINGS

Thu, 27 Oct 2016 08:00:00 EDT

An item is shared based on an information boundary and access control settings. An application, such as a document management application, detects a selection of an information boundary to manage a sharing action associated with the item. The information boundary includes rules that define how the item is shared. A selection of an access control list is also detected to manage the recipients who have access to the item. The access control list gives a recipient in the list the ability to search for and discover the item. In response to detecting the sharing action, the information boundary and the access control list are applied to the item. The item is then shared, based on the information boundary and the access control list, through a link to the item transmitted to a recipient.



AUGMENTING THE DISPLAY OF DATA IN A TREE FORMAT

Thu, 27 Oct 2016 08:00:00 EDT

The method includes identifying a tree data structure. The method includes identifying one or more features in the identified tree data structure, wherein the one or more features comprise at least one of: a node of the tree data structure, an object of the tree data structure, an array of the tree data structure, an object property of the tree data structure, and a root of the tree data structure. The method includes determining whether one of the one or more identified features matches a feature that initiates execution of a rule, wherein the rule defines augmentations to the tree data structure based upon one or more features in the tree data structure. The method includes augmenting the identified tree data structure based upon the determined one or more matches of the one or more identified features and the feature that initiates execution of the rule.



FAST QUERYING OF SOCIAL NETWORK DATA

Thu, 27 Oct 2016 08:00:00 EDT

The disclosed embodiments provide a system for processing data. During operation, the system obtains a graph of a social network, wherein the graph includes a set of nodes representing users in the social network and a set of edges representing relationships between pairs of the users. Next, the system stores, on a single computer system, a static representation of the graph, wherein the static representation includes a first set of fixed-size blocks representing the nodes and the edges and a first index that maps a set of identifiers for the nodes and the edges to offsets of the first set of fixed-size blocks. The system then uses the static representation of the graph to process, by the single computer system, one or more queries of the graph.
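
The idea of fixed-size blocks plus an index of offsets can be sketched with Python's standard `struct` module. This is a simplified sketch, not the disclosed system: it stores only edge blocks and indexes them by source node, and the names `BLOCK`, `build_static_graph`, and `neighbors` are hypothetical.

```python
import struct

# One fixed-size block holds a single edge: (source id, target id).
BLOCK = struct.Struct("<II")

def build_static_graph(edges):
    """Pack edges into fixed-size blocks and index each node's first block."""
    data = bytearray()
    index = {}  # node id -> (byte offset of first block, number of blocks)
    for src, dst in sorted(edges):  # sorting keeps each node's edges contiguous
        offset, count = index.get(src, (len(data), 0))
        index[src] = (offset, count + 1)
        data += BLOCK.pack(src, dst)
    return bytes(data), index

def neighbors(data, index, node):
    """Answer a neighbor query by jumping straight to the node's blocks."""
    offset, count = index.get(node, (0, 0))
    return [BLOCK.unpack_from(data, offset + i * BLOCK.size)[1]
            for i in range(count)]
```

Because every block has the same size, a lookup is one index access followed by offset arithmetic, with no scanning of the packed data.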



METHODS AND SYSTEMS FOR TEAM SEARCHES IN A SOCIAL NETWORKING SERVICE

Thu, 27 Oct 2016 08:00:00 EDT

Techniques for team searches within a social graph are described. Consistent with some embodiments, a search request initiated by a searching member profile is received. The search request includes search criteria. A team membership connection between the searching member profile and a team profile is then detected. Based on the detected team membership connection, profile connections between member profiles and teammates of the searching member profile are identified. The teammates are member profiles with team membership connections to the team profile. Then, matching member profiles are identified by matching the member profiles with the identified profile connection with the search criteria. The matching member profiles are then communicated to the searching member profile.



GENERATING MOBILE-FRIENDLINESS SCORES FOR RESOURCES

Thu, 27 Oct 2016 08:00:00 EDT

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining a mobile-friendliness score for a resource. One of the methods includes obtaining data identifying a particular resource; submitting a request for the particular resource to a site hosting the particular resource; receiving, in response to the request, a resource document from the site hosting the particular resource; rendering the resource document; evaluating a result of the rendering to determine a respective signal score for each of one or more mobile-friendliness signals; computing a mobile-friendliness score for the particular resource from the one or more signal scores, wherein the mobile-friendliness score represents a degree to which the particular resource has been optimized to be viewed on a mobile device; and associating the mobile-friendliness score for the particular resource with the particular resource in an index.



INFORMATION RETRIEVAL METHOD, APPARATUS AND SYSTEM

Thu, 27 Oct 2016 08:00:00 EDT

The present disclosure provides an information retrieval method, apparatus and system in the field of information processing. The method includes: receiving a retrieval line that is input graphically; detecting whether an information model matching the retrieval line exists, each information model corresponding to an information model identifier; if such an information model exists, sending its information model identifier to a server; and receiving at least one piece of information fed back by the server, the server having identified the corresponding information model from the identifier and retrieved at least one piece of information matching that model. This addresses the problem that a user otherwise needs to set condition parameters to construct an information model, which is a complex operation: the user can find the needed information without setting condition parameters, the operation complexity is reduced, and the user does not need to be skilled at constructing information models.



METHOD AND SYSTEM FOR CHARACTERIZING A USER GROUP

Thu, 27 Oct 2016 08:00:00 EDT

The present invention relates to a method for characterizing a group of users, who are related to one another by their mobile communication data, according to their web navigation data. The method comprises: building a social graph from the mobile communication data of a user and the user's contacts; extracting web navigation data for each user of the social graph; associating with each edge linking two users of the social graph the web navigation data extracted for both users; obtaining a measure of harmony by comparing the web navigation data of the two users; and providing a set of metrics for the group of users, based on the measures of harmony, that characterizes the group.



QUERY MEDIATOR, A METHOD OF QUERYING A POLYGLOT DATA TIER AND A COMPUTER PROGRAM EXECUTABLE TO CARRY OUT A METHOD OF QUERYING A POLYGLOT DATA TIER

Thu, 27 Oct 2016 08:00:00 EDT

A query mediator is arranged to query a polyglot data tier of data stores, each data store adopting a data model and the polyglot data tier including at least two different types of data store with differing data models. The query mediator includes at least one HTTP API; a catalogue containing metadata for each data store; and a plurality of adapters, one for each data model. The API receives an incoming query from a client, checks the query against the catalogue to identify a correct data store storing the queried data, and routes the query to an adapter for the correct data store. The adapter transforms the query into a format suitable for use with the data model adopted in the correct data store, for execution by the relevant data store. The API returns the query result to the client in response to the incoming query.
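
The catalogue lookup and adapter routing described here can be sketched as follows. This is a minimal sketch under stated assumptions: the catalogue contents, the adapter functions, and `mediate` are hypothetical names, the HTTP layer is omitted, and the adapters only build a native query rather than executing it against a real store.

```python
def sql_adapter(dataset, filters):
    # Illustration only: production code should use parameterized queries.
    where = " AND ".join(f"{key} = '{value}'" for key, value in filters.items())
    return f"SELECT * FROM {dataset} WHERE {where}"

def document_adapter(dataset, filters):
    # Shape of a generic document-store query; not tied to any real product.
    return {"collection": dataset, "filter": dict(filters)}

# Catalogue metadata: which data model holds each dataset (assumed contents).
CATALOGUE = {"orders": "relational", "reviews": "document"}

# One adapter per data model.
ADAPTERS = {"relational": sql_adapter, "document": document_adapter}

def mediate(dataset, filters):
    """Look the dataset up in the catalogue and route to the right adapter."""
    model = CATALOGUE[dataset]
    return ADAPTERS[model](dataset, filters)
```

The client issues the same kind of call for either store; only the adapter knows the native query format.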



Systems and Methods for Verifying User Credentials for Search

Thu, 27 Oct 2016 08:00:00 EDT

Provided are systems and methods for verifying user credentials for performing a search. In one embodiment, a method can be provided that includes receiving a request to perform a search of machine generated data comprising time stamped events that is associated with a user, determining whether a set of cached user credentials has been updated within a period of time, querying, in response to determining that the credentials for the user have not been updated within the period of time, an identity provider server for a current set of user credentials associated with the user, receiving the current set of user credentials, determining whether the user has privileges to perform the search based at least in part on the set of user credentials, and causing, in response to determining that the user has privileges to perform the search, the search to be performed to identify one or more of the events that are responsive to the search.
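
The refresh-then-authorize flow can be sketched as below. This is a sketch under stated assumptions rather than the claimed method: `authorize_search`, the cache layout, the `"search"` role name, and the `fetch_credentials` callback (standing in for the identity provider server) are all hypothetical.

```python
import time

def authorize_search(user, cache, fetch_credentials, max_age_s, now=None):
    """Return True if the user may search, refreshing stale cached credentials."""
    now = time.time() if now is None else now
    entry = cache.get(user)
    # Query the identity provider only when the cache is missing or too old.
    if entry is None or now - entry["updated"] > max_age_s:
        entry = {"roles": set(fetch_credentials(user)), "updated": now}
        cache[user] = entry
    # Assumed privilege model: a "search" role grants the right to search.
    return "search" in entry["roles"]
```

Passing `now` explicitly makes the freshness window easy to exercise without waiting for real time to pass.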



SOCIAL CONTENT FEATURES BASED ON USER TRACKING

Thu, 27 Oct 2016 08:00:00 EDT

Prioritizing online comments on a social network web page is disclosed. An activity of a user consuming content presented on the social network web page is detected. The time spent by the user consuming the content is determined. Responsive to the user entering a comment on the content, a depth of consumption of the content by the user is determined based on the time spent by the user consuming the content. The comment entered by the user is ranked among a plurality of comments entered by one or more of a plurality of users based on the depth of consumption. The comment entered by the user is presented on the social network web page in the order of the ranking.
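
One way to turn time spent into a ranking is sketched below. The abstract does not define how depth of consumption is computed, so the formula here (time spent divided by the content's duration, capped at 1.0) is an assumption, as are the name `rank_comments` and the field name `seconds_spent`.

```python
def rank_comments(comments, content_duration_s):
    """Order comments so that deeper consumers of the content come first."""
    def depth(comment):
        # Assumed measure: fraction of the content consumed, capped at 100%.
        return min(comment["seconds_spent"] / content_duration_s, 1.0)
    return sorted(comments, key=depth, reverse=True)
```

With a 240-second video, a commenter who watched for 120 seconds would rank above one who watched for 10 seconds.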



CONTENT DISTRIBUTION BASED ON ENTITY IDENTIFIERS

Thu, 27 Oct 2016 08:00:00 EDT

Techniques are provided for a particular party to distribute content provided by other parties. The particular party receives multiple textual content items from a data source provided by a different party. Furthermore, the particular party determines, for each textual content item, whether the textual content item is associated with metadata that includes an instance of an entity identifier in a set of entity identifiers maintained by the particular party. The instance is provided by the different party. Each entity identifier in the set of entity identifiers uniquely identifies a different respective entity of a plurality of entities and indicates whether the different respective entity is a subject of the textual content item. In response to determining that a textual content item is associated with metadata that includes an instance of an entity identifier in the set of entity identifiers, target users are determined based on the entity identifier.



ENHANCING SEARCH RESULT PAGES USING STRUCTURAL INFORMATION ABOUT THE STRUCTURE OF CONTENT FROM CONTENT PROVIDERS

Thu, 27 Oct 2016 08:00:00 EDT

A search engine provider interacts with a content provider wherein the content provider provides content to the search engine provider. The content may comprise information that indicates a structure of the content provider's web pages. The search engine may use structural information to classify and extract data items from web pages, and to highlight those data items in search results with labels that identify each such data item's class.



WEB PAGE RECOGNIZING METHOD AND APPARATUS

Thu, 27 Oct 2016 08:00:00 EDT

Disclosed is a web page recognizing method, which includes obtaining a weight of each segmented word of a web page to be recognized; acquiring, according to the weight of each segmented word of a web page to be recognized, weights of the web page to be recognized in two predetermined web page categories through calculation by using a logistic regression model established in advance; and taking a web page category having a greater weight as a category of the web page to be recognized. Also disclosed is a web page recognizing apparatus. The present disclosure recognizes a web page more accurately, especially for a web page in which key words are difficult to distinguish.



SEARCHING RESTRICTED CONTENT ON A NETWORK

Thu, 27 Oct 2016 08:00:00 EDT

A server may utilize a dialog engine and a search engine to extract key phrases from a received natural language query. A database may be searched for a set of webpages based on the extracted key phrases. A first response having a portion of the set of webpages may be transmitted. After the first response, and without user search input, the server may continue to search for information associated with the received natural language query using a machine learning function.



GENERATING A DISCOVERY PAGE DEPICTING ITEM ASPECTS

Thu, 27 Oct 2016 08:00:00 EDT

In various example embodiments, a system and method for generating a discovery page that depicts item aspects are presented. A query that includes an identifier of an item is received from a client device. A category of items that includes the item described in the query is identified using the identifier of the item. One or more aspects that correspond to a group of items included in the category of items are determined. Item listings for the group of items from the category of items are accessed. A discovery page that depicts the accessed item listings of the group of items in relation to the one or more aspects is generated. Display of the generated discovery page is caused.



CONTENT CONTRIBUTION VALIDATION

Thu, 27 Oct 2016 08:00:00 EDT

Embodiments of the invention provide a method, system and computer program product for content contribution validation. A content contribution validation method includes receiving in memory of a host computing system, from over a computer communications network, a content contribution to existing content stored in data storage coupled to the host computing system. The method also includes selecting a portion of the textual terms of the contribution and generating a search query utilizing the selected portion. The method further includes querying the existing content by a processor of the host computing system using the search query and receiving a result set from the search query. Finally, the method includes determining by the processor whether or not the result set exceeds a threshold match and applying the content contribution to the existing content in response to a determination that the result set exceeds the threshold match, but otherwise rejecting the content contribution.
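
The select-query-threshold sequence can be sketched as follows. Everything specific here is an assumption made for illustration: the term-selection heuristic (longest distinct words), the matching rule (a document must contain every query term), the threshold on the number of results, and the name `validate_contribution`.

```python
import re

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def validate_contribution(contribution, existing_docs, num_terms=3, threshold=0):
    """Accept a contribution only if a query built from it matches enough docs."""
    words = set(tokenize(contribution))
    # Assumed heuristic: the longest words are the most distinctive terms.
    query = sorted(words, key=lambda w: (-len(w), w))[:num_terms]
    # Stand-in for a search engine: a doc matches if it holds every query term.
    results = [doc for doc in existing_docs
               if all(term in tokenize(doc) for term in query)]
    return len(results) > threshold
```

A contribution that shares its key terms with existing content is accepted, while an unrelated one produces an empty result set and is rejected.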



METHODS AND SYSTEMS FOR PROVIDING SERENDIPITOUS RECOMMENDATIONS

Thu, 27 Oct 2016 08:00:00 EDT

Systems and methods are described herein for returning search results that may be rare or surprising relative to what a user would expect from a search performed on a user-input symbol. As an example, if a user were searching for media related to the television show “It's Always Sunny in Philadelphia” by entering the search symbol “It's Always Sunny in Philadelphia,” the user would typically expect the search results to yield media related to that television show. However, if the user's profile indicates that the user is fascinated by astronomy, for example, the search result may also yield a result corresponding to a documentary discussing the composition of the sun.



SYSTEM FOR LINKING DIVERSE DATA SYSTEMS

Thu, 27 Oct 2016 08:00:00 EDT

A system creates an abstraction layer surrounding a diverse data system including multiple different databases. Data is received from data sources and ingested into the various databases according to a core model. New instances of the core model are created and added to a larger linked data model (LDM) when new data sources are added to the system. The LDM captures the linkages between different linked data objects and links across different databases. Accordingly, applications are able to access or explore the linked data stored in different databases without prior knowledge of the linking relationships.



METHOD, APPARATUS, AND COMPUTER PROGRAM PRODUCT FOR CLASSIFICATION AND TAGGING OF TEXTUAL DATA

Thu, 27 Oct 2016 08:00:00 EDT

Provided herein are systems, methods and computer readable media for classification and tagging of textual data. An example method may include accessing a corpus comprising a plurality of documents, each document having one or more labels indicative of services offered by a merchant, generating a query based on extracted features and the documents, generating a precision score for at least a portion of the generated query and selecting a subset of the generated queries based on an assigned precision score satisfying a precision score threshold, the selected subset of the generated queries configured to provide an indication of one or more labels to be applied to machine readable text. A second example method, utilized for tagging machine readable text with unknown labels, may include assigning a label to textual portions of the machine readable text based on results of the application of the queries.



ATTRIBUTE-BASED CONTEXTS FOR SENTIMENT-TOPIC PAIRS

Thu, 27 Oct 2016 08:00:00 EDT

The disclosed embodiments provide a system for processing data. During operation, the system obtains a set of content items and a set of topics in the set of content items. For each topic in the set of topics, the system automatically extracts a set of attributes that provides a context for the topic from a subset of the content items containing the topic. The system then displays the set of attributes in the context of the topic to improve understanding of the set of content items by the user without requiring the user to manually analyze the set of content items.



METHOD AND SYSTEM OF SEARCHING A PUBLIC ACCOUNT IN A SOCIAL NETWORKING APPLICATION

Thu, 27 Oct 2016 08:00:00 EDT

A method of searching a public account in a social networking application includes logging in to a social networking application using a user account and then searching a public account using a keyword in an account database of the social networking application. The method further includes acquiring a first list of public accounts that match the keyword; acquiring a second list of public accounts that friends of the user liked; and comparing the first list of public accounts with the second list of public accounts. If a public account of the first list also appears on the second list, the method includes retrieving information of friends who liked the public account, counting a number of friends who liked the public account, and displaying the public accounts in the first list with information of friends who liked the public accounts.
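
The list comparison and friend counting can be sketched as below. This is a minimal sketch with hypothetical names and data shapes: `search_public_accounts`, a plain list of account names in place of the account database, and a dictionary mapping each friend to the set of accounts that friend liked.

```python
def search_public_accounts(keyword, accounts, friend_likes):
    """Return (account, friend count, friend names) for each keyword match."""
    # First list: public accounts whose name contains the keyword.
    first = [a for a in accounts if keyword.lower() in a.lower()]
    # Second list: every public account liked by at least one friend.
    second = set().union(*friend_likes.values())
    results = []
    for account in first:
        friends = []
        if account in second:
            friends = sorted(f for f, liked in friend_likes.items()
                             if account in liked)
        results.append((account, len(friends), friends))
    return results
```

Accounts that match the keyword but have no friend likes are still returned, with a count of zero.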



DATA RESOLUTION WITHIN SEARCH RESULTS FROM A HIERARCHICALLY ASSOCIATED DATABASE

Thu, 27 Oct 2016 08:00:00 EDT

A computerized method for multi-level data resolution based upon searches of the hierarchically organized elements comprises receiving a database query directed towards returning information from one or more locations within a hierarchically organized data structure. The method can also comprise generating a summary of the information that conforms with the query and a first filter condition. Additionally, the method can comprise receiving a request for a detailed view of the information in the summary. Upon receiving the request the method can include generating a detailed accounting of the information that conforms with the query and a second filter condition. Additionally, the second filter condition can be different from the first filter condition.



PROVIDING SEARCHING STRATEGY IN CONNECTION WITH ANSWERING QUESTION IN MESSAGE

Thu, 27 Oct 2016 08:00:00 EDT

A method, system and computer program product for providing a searching strategy in connection with answering a question in a message. A message containing a question is detected as being received from a sender. The steps performed by the recipient of the message to answer the question are monitored. Content is detected as being inserted in a reply message responding to the sender's message via a copy and paste operation. In response to detecting the copy and paste operation, the monitored steps utilized by the user in the user's searching strategy in arriving at an answer to the sender's question are stored in a database. The monitored steps are then attached to the reply message as tags, or a link to the database for retrieving the stored monitored steps is inserted in the reply message, thereby allowing the sender to replay the searching strategy used in answering the sender's question.



USING A GRAPH DATABASE TO MATCH ENTITIES BY EVALUATING BOOLEAN EXPRESSIONS

Thu, 27 Oct 2016 08:00:00 EDT

A method of matching a first entity to a second entity by evaluating Boolean expressions includes identifying a set of criteria vertices for a second entity vertex by traversing a graph database in a manner constrained to fact vertices identified for the second entity. The graph database relates fact vertices to the criteria vertices by edges corresponding to Boolean expressions for satisfying criteria for matching first entities to second entities. The method additionally includes selecting one of the first entities based on the criteria vertices of the set. The method further includes matching the first entity to the second entity based on the selection.



DETECTING AND COMBINING SYNONYMOUS TOPICS

Thu, 27 Oct 2016 08:00:00 EDT

The disclosed embodiments provide a system for processing data. During operation, the system obtains a set of topics associated with a set of content items. Next, the system obtains a first set of attributes associated with a first topic in the set of topics and a second set of attributes associated with a second topic in the set of topics. Next, the system calculates a similarity between the first and second sets of attributes and applies a threshold to the similarity to identify the first and second topics as synonymous when the similarity exceeds the threshold. The system then merges the first and second topics under a representative topic. Finally, the system displays the representative topic to a user to improve understanding of the set of content items by the user without requiring the user to manually analyze the set of content items.
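
The similarity-and-threshold step can be sketched with Jaccard similarity over attribute sets. The abstract does not name a similarity measure, so Jaccard is a swapped-in assumption, as are the names `jaccard` and `merge_synonyms` and the choice of the first-seen topic as the representative.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def merge_synonyms(topic_attrs, threshold=0.5):
    """Map each topic to a representative topic with similar attributes."""
    representative = {}
    topics = list(topic_attrs)
    for i, topic in enumerate(topics):
        representative.setdefault(topic, topic)
        for other in topics[i + 1:]:
            # Merge only topics that have not already been assigned.
            if other not in representative and \
                    jaccard(topic_attrs[topic], topic_attrs[other]) > threshold:
                representative[other] = representative[topic]
    return representative
```

In the usage below, "price" and "cost" share three of four attributes (similarity 0.75), so they merge, while "support" stays separate.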



SPECULATIVE SEARCH RESULT ON A NOT-YET-SUBMITTED SEARCH QUERY

Thu, 27 Oct 2016 08:00:00 EDT

Providing a speculative search result for a search query prior to completion of the search query is described. In response to receiving a search query from a client node, a speculative search result is provided to the client node for the search query prior to receiving an indication from the client node that said search query is completely formed. The speculative search result may be displayed on the same web page on the client node as the search query, while the search query is being entered by the user. As the user further enters the search query, a new speculative search result may be provided to the user.



ESTIMATING PROBABILITY OF SPREADING INFORMATION BY USERS ON MICRO-WEBLOGS

Thu, 27 Oct 2016 08:00:00 EDT

Methods and systems for estimating a probability of re-sharing information include extracting keywords from a set of documents addressed to a user. The keywords from the set of documents are weighted according to a metric for the user's interest in the keywords' respective source documents to create an interest model. A new document having one or more keywords is received. A likelihood that the user will re-share the new document is determined. The likelihood is based on the interest model and the one or more keywords present in the new document. The new document is automatically responded to based on the determined likelihood.



MANAGING INFORMATION ABOUT RELATIONSHIPS IN A SOCIAL NETWORK VIA A SOCIAL TIMELINE

Thu, 27 Oct 2016 08:00:00 EDT

A system, method, and computer program for generating a social timeline is provided. A plurality of data items associated with at least one relationship between users associated with a social network is received, each data item having an associated time. The data items are ordered according to the at least one relationship. A social timeline is generated according to the ordered data items.



TOPIC EXTRACTION USING CLAUSE SEGMENTATION AND HIGH-FREQUENCY WORDS

Thu, 27 Oct 2016 08:00:00 EDT

The disclosed embodiments provide a system for processing data. During operation, the system obtains a set of clauses in a first set of content items comprising unstructured data. Next, the system obtains a set of stop words comprising high-frequency words that occur in a second set of content items. The system then automatically extracts a set of topics from the set of clauses by generating a set of n-grams from the set of clauses and excluding a first n-gram in the set of n-grams from the set of topics when the first n-gram contains a word in the set of stop words in a pre-specified position of the first n-gram. Finally, the system displays the set of topics to a user to improve understanding of the first set of content items by the user without requiring the user to manually analyze the first set of content items.
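
The n-gram filtering rule can be sketched as follows. This is a minimal sketch, assuming whitespace tokenization and assuming that the "pre-specified positions" are the first and last word of each n-gram; `extract_topics` and `blocked_positions` are hypothetical names.

```python
def extract_topics(clauses, stop_words, n=2, blocked_positions=(0, -1)):
    """Generate n-grams per clause, dropping those with a stop word
    in any blocked position (by default the first or last word)."""
    topics = []
    for clause in clauses:
        words = clause.lower().split()
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if any(gram[pos] in stop_words for pos in blocked_positions):
                continue
            topics.append(" ".join(gram))
    return topics
```

A stop word in the middle of an n-gram is allowed under this rule, so a trigram such as "quality of service" survives even when "of" is a stop word.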



Computer-Implemented System And Method For Selecting Documents For Review

Thu, 27 Oct 2016 08:00:00 EDT

A computer-implemented system and method for selecting documents for review is provided. A master array of messages and topics for the messages is generated. The messages in the master array are sorted by the topics and the sorted messages are processed. During processing, each message in the master array is identified as unique, duplicate, or near duplicate. The unique messages are extracted from the duplicate and near duplicate messages, and entered into a log by creating a log entry for each of the unique messages. Each log entry includes a source of and identification information for one of the unique messages. The unique messages are then provided for document review.



CATEGORIZING HASH TAGS

Thu, 27 Oct 2016 08:00:00 EDT

A content item categorizer system retrieves content items from Internet sources. If a retrieved content item includes sufficient information for traditional categorization methods, then the system assigns one or more categories to the content item using such traditional methods. The system creates a metadata model, based on information about traditionally-categorized content items, that maps at least hashtags from the content items to one or more content categories. When the system retrieves a sparse-info item that does not include sufficient information for traditional categorization, the system applies the metadata model to categorize the content item using at least hashtags in the sparse-info item. The metadata model may also include information indicating mappings between categories and coincidence of hashtags and additional content item attributes. Also, the metadata model may provide information for categorizing sparse-info items based on multiple hashtags in the sparse-info item metadata.



SYSTEM AND METHOD FOR PROVIDING DIFFERENTIATED STORAGE SERVICE IN A DATABASE

Thu, 27 Oct 2016 08:00:00 EDT

In accordance with some embodiments, classification of input/output requests from a database to a storage system may be performed. Each input/output request may be associated with a database class, and each database class may be mapped to a quality of service policy. Thus, quality of service may be enforced such that different data blocks within the storage system of the database may be afforded appropriate quality of service.



SYSTEMS AND METHODS FOR PROVIDING A USER WITH A SET OF INTERACTIVITY FEATURES LOCALLY ON A USER DEVICE

Thu, 27 Oct 2016 08:00:00 EDT

A user may be provided with an interactive user interface that fully enables interactions regardless of connectivity status. In some embodiments, one or more content items may be selected by a user of a user device and queued for upload to a content management system. The content items may be organized into one or more collections of content items with other content items already stored in a user account on the content management system, such as images having similar geo-temporal characteristics. A set of interactivity features may be available to the user for interacting with the queued content items. This may give the user the seamless feeling that the content items have already been uploaded to the content management system and that the user is interacting with them there, even if the upload has not been completed.



CUSTODIAN DISAMBIGUATION AND DATA MATCHING

Thu, 27 Oct 2016 08:00:00 EDT

Provided is a technique for matching different user representations of a person in a plurality of computer systems. The technique includes collecting information sets about user representations from a plurality of computer systems; normalizing the information sets to a unified format; grouping the information sets in the unified format into indexing buckets based on a user name using a non-phonetic algorithm; determining a similarity score for each pair of information sets in each of the indexing buckets; classifying each information set pair into a set of classes based on the similarity scores, wherein the set of classes comprises at least matches and non-matches; and using a data structure for merging information of information set pairs classified as matches.
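
The normalize, bucket, score, and classify steps can be sketched with Python's standard `difflib`. This is a sketch under stated assumptions, not the claimed technique: the bucketing key (first letter of the normalized name), the use of `SequenceMatcher.ratio()` as the similarity score, the 0.8 threshold, and the name `match_custodians` are all illustrative choices, and the final merging step is left out.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

def normalize(record):
    # Unified format: lower-case name with collapsed whitespace.
    return {"name": " ".join(record["name"].lower().split()),
            "system": record["system"]}

def match_custodians(records, threshold=0.8):
    """Classify record pairs within each bucket as 'match' or 'non-match'."""
    buckets = defaultdict(list)
    for rec in map(normalize, records):
        # Assumed non-phonetic bucketing key: first letter of the name.
        buckets[rec["name"][:1]].append(rec)
    pairs = []
    for group in buckets.values():
        for a, b in combinations(group, 2):
            score = SequenceMatcher(None, a["name"], b["name"]).ratio()
            label = "match" if score >= threshold else "non-match"
            pairs.append((a["system"], b["system"], label))
    return pairs
```

Bucketing first keeps the pairwise comparison from growing quadratically over the whole data set, since only records that share a bucket are compared.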



IDENTIFYING EVENTS FROM AGGREGATED DEVICE SENSED PHYSICAL DATA

Thu, 27 Oct 2016 08:00:00 EDT

Aspects extend to methods, systems, and computer program products for predicting events from aggregated device sensed physical data. Aspects facilitate dynamically targeted collection and aggregation of physical metrics (e.g., body metrics and environmental metrics) from varying sensing devices. Aggregated data can be used for pattern analysis, reporting and predictive results on health related events (or other scenarios). Collected physical metric data can be anonymized or personalized based at least in part on data source. Pattern analysis can be used to report at different levels (e.g., personal or commercial, localized or global) and return relevant, contextually driven results, including potential healthcare related events or other events relating to the study of changes that occur in large groups of people over a period of time (e.g., relating to demography).



CLASSIFYING DOCUMENTS BY CLUSTER

Thu, 27 Oct 2016 08:00:00 EDT

Methods, apparatus, systems, and computer-readable media are provided for classifying, or “labeling,” documents such as emails en masse based on association with a cluster/template. In various implementations, a corpus of documents may be grouped into a plurality of disjoint clusters of documents based on one or more shared content attributes. A classification distribution associated with a first cluster of the plurality of clusters may be determined based on classifications assigned to individual documents of the first cluster. A classification distribution associated with a second cluster of the plurality of clusters may then be determined based at least in part on the classification distribution associated with the first cluster and a relationship between the first and second clusters.
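
A minimal sketch of the two distributions described above, under assumptions: the abstract does not say how the relationship between clusters enters the calculation, so the linear blend in `propagate`, the `relationship_weight` parameter, and the `prior` argument are all hypothetical.

```python
from collections import Counter


def classification_distribution(labels):
    """Fraction of a cluster's labeled documents that carry each classification."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def propagate(first_distribution, relationship_weight, prior):
    """Estimate a second cluster's distribution (assumed rule: linear blend).

    relationship_weight in [0, 1] stands in for the strength of the
    relationship between the two clusters; prior is whatever is already
    known about the second cluster.
    """
    labels = set(first_distribution) | set(prior)
    return {
        label: relationship_weight * first_distribution.get(label, 0.0)
        + (1 - relationship_weight) * prior.get(label, 0.0)
        for label in labels
    }
```

For example, a first cluster labeled three-quarters "travel" pulls an evenly split second cluster toward "travel" in proportion to the weight.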



CLUSTERING COMMUNICATIONS BASED ON CLASSIFICATION

Thu, 27 Oct 2016 08:00:00 EDT

Methods and apparatus related to clustering documents based on one or more classification terms and optionally based on similarity of structural paths of the documents. In some implementations, the documents are communications such as structured emails or other structured communications. In some of those implementations, clustering the communications includes identifying a plurality of classification terms indicative of a classification, identifying a corpus of communications that includes communications that are not labeled with an association to the classification, and determining a cluster of the communications based on occurrence of one or more of the classification terms in the communications of the cluster.



OLAP ENGINE WORKLOAD DISTRIBUTION USING COST BASED ALGORITHMS

Thu, 27 Oct 2016 08:00:00 EDT

One or more processors divide an OLAP cube into one or more cubelets. One or more processors determine a weight corresponding to each node present within each of the one or more cubelets. One or more processors determine a total cost corresponding to each of the one or more cubelets. One or more processors assign execution of a portion of a workload corresponding to each of the one or more cubelets to a data processing element. The assignment of the execution of the portion of the workload corresponding to a cubelet to a data processing element is based on the determined total cost corresponding to the cubelet.
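
One way to picture cost-based assignment is the sketch below. It assumes that a cubelet's total cost is the sum of its node weights and that cubelets are handed out greedily, most expensive first, to the least-loaded processing element; the abstract specifies neither, and the function names are hypothetical.

```python
import heapq


def cubelet_cost(node_weights):
    """Total cost of a cubelet (assumed: sum of its node weights)."""
    return sum(node_weights)


def assign_cubelets(cubelets, num_elements):
    """Assign cubelets to processing elements by cost.

    cubelets: dict mapping cubelet name -> list of node weights.
    Returns a dict mapping element index -> list of cubelet names.
    """
    costs = {name: cubelet_cost(weights) for name, weights in cubelets.items()}

    # Min-heap of (current load, element index).
    heap = [(0, element) for element in range(num_elements)]
    heapq.heapify(heap)
    assignment = {element: [] for element in range(num_elements)}

    # Most expensive cubelets first; ties broken by name for determinism.
    for name in sorted(costs, key=lambda n: (-costs[n], n)):
        load, element = heapq.heappop(heap)
        assignment[element].append(name)
        heapq.heappush(heap, (load + costs[name], element))
    return assignment
```

With cubelet costs of 10, 7, 4, and 1 and two processing elements, this greedy rule yields loads of 11 and 11.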



SYSTEM OF DYNAMIC HIERARCHIES BASED ON A SEARCHABLE ENTITY MODEL

Thu, 27 Oct 2016 08:00:00 EDT

A system having dynamic hierarchies based on a searchable entity model. The present system may be used for defining hierarchies by defining each level based on the attributes of its objects. The multiple level structure may be specified according to level definitions, each one defining a mechanism for determining the members in an instantiation of the actual system hierarchy. Level definitions may incorporate lists, queries based on semantic tags and relationships, groupings based on semantic tags, and traversal of relationships. A feature may be that multiple hierarchies can be defined and the hierarchies can be updated automatically as the system is modified and objects are added, removed, or modified.



Crowd Sourced Data Sampling at the Crowd

Thu, 27 Oct 2016 08:00:00 EDT

An approach is provided for sampling crowd sourced data. The approach selects a sampling node from a set of crowd nodes. The sampling node receives a data acquisition request from a data collector and receives data from the set of crowd nodes, with the data being responsive to the data acquisition request. The received data is processed by the sampling node to reduce redundant data as defined by the data acquisition request. An acquired data message block is generated and transmitted from the sampling node to the data collector.
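
The redundancy-reduction step at the sampling node might look like the sketch below. It assumes the data acquisition request carries a numeric `resolution` and that two readings are redundant when they round to the same multiple of it; the request fields and the message block layout are hypothetical.

```python
def build_message_block(request, readings):
    """Reduce redundant readings and package the rest for the data collector.

    request: {"id": str, "resolution": float} (assumed shape).
    readings: list of (node_id, value) pairs received from crowd nodes.
    """
    resolution = request["resolution"]
    seen = set()
    samples = []
    for node_id, value in readings:
        bucket = round(value / resolution)  # readings in one bucket are redundant
        if bucket in seen:
            continue
        seen.add(bucket)
        samples.append({"node": node_id, "value": value})
    return {
        "request_id": request["id"],
        "samples": samples,
        "received": len(readings),
    }
```

Deduplicating at the sampling node means the data collector receives one message block rather than one message per crowd node.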



METHOD AND APPARATUS FOR PROCESSING DATABASE DATA IN DISTRIBUTED DATABASE SYSTEM

Thu, 27 Oct 2016 08:00:00 EDT

A computer program product configured to implement a method for processing database data in a distributed database system, wherein the distributed database system comprises a plurality of computing nodes communicatively coupled via computer networks, the method comprising: creating a plurality of different data replicas wherein each of the data replicas is created in the following way: sorting the database data according to at least one data attribute; generating a row key based on the at least one data attribute; and using the sorted database data with the row key as the data replica, storing different data replicas in different computing nodes; and creating an index for each of the data replicas according to its row key.
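
As a rough sketch of the idea, each replica below holds the same rows sorted by a different attribute, and a query is served from whichever replica is sorted on the attribute being searched. The row key format (attribute value plus an `id` field), the sorted-list index, and the node names are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def build_replica(rows, attribute):
    """Return the rows sorted by `attribute`, each paired with its row key."""
    keyed = [((row[attribute], row["id"]), row) for row in rows]
    keyed.sort(key=lambda pair: pair[0])
    return keyed


def build_index(replica):
    """Index a replica by the leading component of each row key."""
    return [row_key[0] for row_key, _ in replica]


def lookup(replica, index, value):
    """Return all rows in the replica whose sort attribute equals `value`."""
    lo = bisect_left(index, value)
    hi = bisect_right(index, value)
    return [row for _, row in replica[lo:hi]]


rows = [
    {"id": 1, "city": "Oslo", "age": 30},
    {"id": 2, "city": "Bergen", "age": 25},
    {"id": 3, "city": "Oslo", "age": 41},
]

# Different replicas stored on different (hypothetical) computing nodes.
nodes = {
    "node-a": build_replica(rows, "city"),
    "node-b": build_replica(rows, "age"),
}
indexes = {name: build_index(replica) for name, replica in nodes.items()}
```

Because every replica is sorted on its own row key, a lookup on either attribute becomes a binary search rather than a full scan.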



METHOD AND APPARATUS OF MAINTAINING DATA FOR ONLINE ANALYTICAL PROCESSING IN A DATABASE SYSTEM

Thu, 27 Oct 2016 08:00:00 EDT

A method and an apparatus of maintaining data for online analytical processing in a database system. The method includes: tracking a changed page in a main process; and synchronizing the changed page to a child process for online analytical processing. In the method and apparatus of maintaining data for online analytical processing, the changed pages are tracked and then the child process is synchronized with the changed pages. Therefore, periodic forking is avoided, fork overhead due to periodic forking in the prior art is removed, the synchronization is faster since only the changed pages are synchronized, and the performance of online data processing is enhanced.
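
The track-then-synchronize idea can be illustrated without real processes. In the sketch below a plain dictionary stands in for the child process's copy of the pages, and the class and method names are hypothetical; an actual implementation would work at the memory-page level.

```python
class TrackedPages:
    """Page store for the main process that records which pages were changed."""

    def __init__(self, pages):
        self.pages = dict(pages)
        self.changed = set()

    def write(self, page_id, data):
        """Update a page and remember that it is now out of sync."""
        self.pages[page_id] = data
        self.changed.add(page_id)

    def sync_to(self, replica):
        """Copy only the changed pages into the analytical copy."""
        synced = sorted(self.changed)
        for page_id in synced:
            replica[page_id] = self.pages[page_id]
        self.changed.clear()
        return synced


main = TrackedPages({0: b"a", 1: b"b", 2: b"c"})
olap_copy = dict(main.pages)  # initial full copy, made once

main.write(1, b"B")
main.write(2, b"C")
synced_pages = main.sync_to(olap_copy)  # copies pages 1 and 2 only
```

The cost of each synchronization scales with the number of changed pages rather than with the size of the database, which is the benefit the abstract claims over periodic forking.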



DISTRIBUTED BALANCED OPTIMIZATION FOR AN EXTRACT, TRANSFORM, AND LOAD (ETL) JOB

Thu, 27 Oct 2016 08:00:00 EDT

Provided are techniques for distributed balanced optimization for an Extract, Transform, and Load (ETL) job across distributed systems of participating ETL servers. A data flow graph with links and stages for an ETL job to be executed by participating ETL servers is received. A distributed job execution plan is generated that breaks the data flow graph into job segments that each include a subset of the links and stages and map to one participating ETL server from the distributed systems to meet an optimization criterion across the distributed systems, wherein the distributed job execution plan utilizes statistics to reduce data movement and redundancies and to balance workloads across the distributed systems. Each of the job segments is distributed to the participating ETL servers based on the mappings for parallel execution.



DATA MINING METHOD

Thu, 27 Oct 2016 08:00:00 EDT

The present invention proposes a method for data mining, the method comprising: compiling statistics on the feature vectors of each target object according to the records in a target data set so as to constitute a rough data set, each of the feature vectors including the value of at least one attribute of the corresponding target object; screening, from the rough data set, the feature vectors that correspond to all known target objects of a first type, and performing a filter operation on the screened feature vectors to obtain samples; and building a regression model based on the samples, and then using the built regression model to determine whether each known target object of a second type potentially belongs to the first type. The disclosed method is capable of mining and classifying the target objects according to their comprehensive features.



LOW-LATENCY QUERY PROCESSOR

Thu, 27 Oct 2016 08:00:00 EDT

Techniques for implementing a low-latency query processor accommodating an arbitrary number of data rows with no column indexing. In an aspect, data is stored across a plurality of component databases, with no requirement to strictly allocate data to partitions based on row keys. A histogram table is provided to map object relationships identified in a user query to the component databases where relevant data is stored. A server processing the user query communicates with component databases via an intermediary module. The intermediary module may include intermediary nodes dynamically assigned to connect to the component databases to retrieve and process the queried data.
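
The role of the histogram table can be sketched as a lookup from an object relationship to the component databases that hold matching rows. The class below is a hypothetical illustration; the abstract does not describe the table's actual layout or how the intermediary nodes use it.

```python
from collections import defaultdict


class HistogramTable:
    """Maps an object relationship to per-database row counts (assumed layout)."""

    def __init__(self):
        self._counts = defaultdict(lambda: defaultdict(int))

    def record(self, relationship, database):
        """Note that one more row for `relationship` is stored in `database`."""
        self._counts[relationship][database] += 1

    def databases_for(self, relationship):
        """Component databases an intermediary node would need to contact."""
        return sorted(self._counts.get(relationship, {}))


table = HistogramTable()
table.record(("user", "follows"), "db1")
table.record(("user", "follows"), "db1")
table.record(("user", "follows"), "db3")
table.record(("user", "likes"), "db2")
```

Routing through such a table lets a query skip databases that hold nothing relevant, even though rows are not strictly partitioned by row key.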



External Linking Based On Hierarchical Level Weightings

Thu, 27 Oct 2016 08:00:00 EDT

Certain implementations of the disclosed technology include systems and methods for external linking based on hierarchical level weightings. The method may include associating external query data having one or more query field values with a record in a linked hierarchical database. The linked hierarchical database may include a plurality of records, each record having a record identifier and representing an entity in a hierarchy, each record associated with a hierarchy level, each record including one or more fields, each field configured to contain a field value. The associating may include receiving the external query data, wherein the external query data includes one or more search values; and identifying, from the plurality of records in the linked hierarchical database, one or more matched fields having field values that at least partially match the one or more search values.